AI Update
The latest in AI, LLM, and ML news
January 20, 2025
January 2025
- LLMs May Have a Killer Enterprise App: ‘Digital Labor’ — at Least If Salesforce Agentforce Is Any Indicator
While this is a bit hyped and the reliability of agents is going to be a big question, it’s still indicative of what’s happening.
- Salesforce Closed 200 Deals for Agentforce in Just One Quarter
The company now expects to bring in $37.8 to $38 billion in revenue this fiscal year, an 8-9% increase over the previous year, largely due to AI product sales.
- Benioff: Salesforce Customers Will Deploy One Billion AI Agents Next Year
Benioff predicts AI agents will allow companies to have an unlimited workforce.
- A System of Agents Brings Service-as-Software to Life – $4.6 Trillion Opportunity as AI Transforms Software
- What Can Generative AI *Actually* Do Well?
Starts a bit overly skeptical but ends with a deep dive into different use cases, including Generation, Transformation, Summarization, Decision, Comparison, Retrieval, Digital Navigation, Physical Navigation, and Behavior Simulation.
- Example of Generative AI ROI – Supply Chain Document Processing
This sounds similar to what we’re doing at Affineon, but in our case, the workflow is a doctor’s inbox.
- Ben Evans – AI Eats the World
Another amazing presentation by Ben Evans. Some key points: Is the only moat for LLMs capital? AI gives you infinite interns. And many more.
- Menlo VC – State of Generative AI in the Enterprise
- HBS – AI Companions Reduce Loneliness
- Recommendations on How to Use AI from an Executive
- Slack Survey – Reality Check on AI Usage at Work
61% of desk workers have spent less than 5 hours learning AI. Nearly half of desk workers would be uncomfortable telling their managers they used AI.
- The AI Services Wave: Lessons from Palantir in the New Age of AI
- ByteDance Lays Off Hundreds of TikTok Employees in Shift to AI Content Moderation
- Why Aren’t More Workers Using ChatGPT?
Great breakdown of bottom-up strategies to help adoption. Paradoxically, most people don’t have time to figure out how they can save time.
- Wharton – AI Adoption Report
A survey of over 800 senior business leaders found that weekly Gen AI usage nearly doubled from 37% in 2023 to 72% in 2024.
October 2024
- HBR – AI Can (Mostly) Outperform Human CEOs
A fascinating comparison of human vs. AI. “Rather than fully replacing human CEOs, AI is poised to augment leadership by enhancing data analysis and operational efficiency, leaving humans to focus on long-term vision, ethics, and adaptability in dynamic markets. The future of leadership will likely be a hybrid model where AI complements human decision-making.”
- AI-Powered Invention Machine
“The software’s primary purpose is to scan the literature in both the company’s field and in far-off fields and then suggest new inventions made of old, previously disconnected ones.”
- Generative AI’s Act o1
Two years into the Generative AI revolution, research is progressing the field from “thinking fast” (rapid-fire pre-trained responses) to “thinking slow” (reasoning at inference time). This evolution is unlocking a new cohort of agentic applications.
- AI in Organizations: Some Tactics
People are using AI at work and seeing productivity gains. Yet AI use that boosts individual performance does not always boost organizational performance, for a variety of reasons. So how do you do R&D on ways of using AI?
- HBR – Gen AI Makes Legal Action Cheap — and Companies Need to Prepare
I’ve seen a bunch of startups targeting either compliance or identification of litigation. Interesting times.
- AI Tool Cuts Unexpected Deaths in Hospital by 26%, Canadian Study Finds
September 2024
Business
- Evolution of SaaS Pricing with AI
The number of seats purchased is going down as AI is added. What to do?
- AI at Work is Here – Now Comes the Hard Part
Great study by Microsoft
- AI’s Impact in Hospitality and Travel
Good ideas for every industry
- AI’s Impact on BPOs is Real
- Small Teams, Big Impact: How AI Is Reshuffling The Future Of Work?
- How Dwarkesh Patel Uses AI
- AI Summer – LLMs might also be a trap: they look like products and they look magic, but they aren’t.
- Customer Research Using AI
Technical
- GPT 5 – Everything You Need to Know
A very long article, but worth the read if your browser can handle it
- Auto and Meta Evaluation
- LLM as Judge
- Building a Generative AI Platform
- Lynx Hallucination Detection
- OpenAI Structured Outputs
Finally, reliable JSON output
- Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
General-purpose zero-shot prompt to rate an LLM’s response to a given question on a scale of 1-10. They find that GPT-4’s ratings agree with a human rater as often as one human annotator agrees with another (>80%).
July 2024
Business
- State of GenAI in SaaS Startups
- AI Productivity Gains
- State of AI in Production
AI adoption is still early stage. Investment continues to grow.
- AI Voice Agents – State of Market
- C3.AI Earnings Call
C3.AI gave the example of one of their customers, DLA Piper, a top law firm, moving tasks as knowledge-intensive as due diligence of contracts to AI and gaining an 80% reduction in effort.
- Klarna AI assistant handles two-thirds of customer service chats in its first month
- The jobs being replaced by AI – an analysis of 5M freelancing jobs
The number of writing jobs posted on Upwork has declined by 33% since the arrival of ChatGPT.
Technical
- What We Learned from a Year of Building with LLMs (Part I): Technical
Great series of articles
- What We Learned from a Year of Building with LLMs (Part II): Product and Team
- What We Learned from a Year of Building with LLMs (Part III): Strategy
- LLM From the Trenches: 10 Lessons Learned Operationalizing Models at GoDaddy
- What is an Agent
- Levels of AI Agents
- Mixture of Agents
- Why we no longer use LangChain for building our AI agents
- Cost Of Self Hosting Llama-3 8B
It’s more expensive unless you do it on your own hardware, but even then it’s a 5-year payback period.
May 2024
Business
- Economic Impact of Generative AI
- Impact of Generative AI on Work
- AI leads a service-as-software paradigm shift
- Unicorn Club – Enterprise SaaS Leads the Way
- ML, AI and Data (MAD) Landscape
- Every federal agency now needs to have a chief AI officer
- AI Startups Require New Strategies
- 16 Changes to the Way Enterprises Are Building and Buying Generative AI
Technical
- RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing
- Agent strategies to improve results
- Agent design strategies
- Unusual Prompts
- Offering a tip to LLM improves results
- Prompting structure and setting up the LLM for success
- Generating and using rationales to improve response quality
- Multi Agent Debate
- Meta Prompting
- Meta-Reasoning over Multiple Chains of Thought
- Deceiving to Enlighten: Coaxing LLMs to Self-Reflection for Enhanced Bias Detection and Mitigation
- RAG Complexity and Evaluation
- Great overview of Evals
- Search-Augmented Factuality Evaluator (SAFE)
- Fine-grained, atomic evaluation of factual precision in long-form text
- Evaluating LLM answers
- Taxonomy of hallucinations
- Divide and conquer evaluations of larger outputs to look for hallucinations / consistency
- Information at the beginning of the context window is lost when larger contexts are passed in
- Lost recall in large context windows
- LLM as Optimizer of Prompts
- Automated optimization of prompts
- Lowering LLM costs