AI Update
The latest in AI, LLM, and ML news
September 15, 2024
October 2024
- HBR – AI Can (Mostly) Outperform Human CEOs A fascinating comparison of human vs. AI. “Rather than fully replacing human CEOs, AI is poised to augment leadership by enhancing data analysis and operational efficiency, leaving humans to focus on long-term vision, ethics, and adaptability in dynamic markets. The future of leadership will likely be a hybrid model where AI complements human decision-making.”
- AI-Powered Invention Machine “The software’s primary purpose is to scan the literature in both the company’s field and in far-off fields and then suggest new inventions made of old, previously disconnected ones.”
- Generative AI’s Act o1 Two years into the Generative AI revolution, research is progressing the field from “thinking fast” (rapid-fire pre-trained responses) to “thinking slow” (reasoning at inference time). This evolution is unlocking a new cohort of agentic applications.
- AI in Organizations: Some Tactics People are using AI at work. They are seeing productivity gains. Yet, AI use that boosts individual performance does not always translate to boosting organizational performance for a variety of reasons. So how do you do R&D on ways of using AI?
- HBR – Gen AI Makes Legal Action Cheap — and Companies Need to Prepare I’ve seen a bunch of startups targeting either compliance or identification of litigation. Interesting times.
- AI Tool Cuts Unexpected Deaths in Hospital by 26%, Canadian Study Finds
September 2024
Business
- Evolution of SaaS Pricing with AI The number of seats purchased is going down as AI is added – what to do?
- AI at Work is Here – Now Comes the Hard Part Great study by Microsoft
- AI’s Impact in Hospitality and Travel
Good ideas for every industry
- AI’s Impact on BPOs is Real
- Small Teams, Big Impact: How AI Is Reshuffling The Future Of Work?
- How Dwarkesh Patel Uses AI
- AI Summer – LLMs might also be a trap: they look like products and they look magic, but they aren’t.
- Customer Research Using AI
Technical
- GPT 5 – Everything You Need to Know
A very long article, but worth the read if your browser can handle it
- Auto and Meta Evaluation
- LLM as Judge
- Building a Generative AI Platform
- Lynx Hallucination Detection
- OpenAI Structured Outputs
Finally, reliable JSON output
- Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
A general-purpose zero-shot prompt rates an LLM’s response to a given question on a scale of 1 to 10. The authors find that GPT-4’s ratings agree with a human rater about as often as one human annotator agrees with another (>80%).
July 2024
Business
- State of GenAI in SaaS Startups
- AI Productivity Gains
- State of AI in Production
AI adoption is still early stage. Investment continues to grow.
- AI Voice Agents – State of Market
- C3.AI Earnings Call
C3.AI gave the example of one of their customers, DLA Piper, a top law firm, moving tasks as knowledge-intensive as contract due diligence to AI and gaining an 80% reduction in effort.
- Klarna AI assistant handles two-thirds of customer service chats in its first month
- The jobs being replaced by AI – an analysis of 5M freelancing jobs
The number of writing jobs posted on Upwork has declined by 33% since the arrival of ChatGPT.
Technical
- What We Learned from a Year of Building with LLMs (Part I): Technical
Great series of articles
- What We Learned from a Year of Building with LLMs (Part II): Product and Team
- What We Learned from a Year of Building with LLMs (Part III): Strategy
- LLM From the Trenches: 10 Lessons Learned Operationalizing Models at GoDaddy
- What is an Agent
- Levels of AI Agents
- Mixture of Agents
- Why we no longer use LangChain for building our AI agents
- Cost Of Self Hosting Llama-3 8B
Self-hosting is more expensive unless you run it on your own hardware, and even then the payback period is 5 years.
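For readers who want to run the payback arithmetic on their own numbers, here is a minimal sketch. It assumes the comparison is against a pay-per-use hosted API, and every figure in it is a made-up placeholder chosen only to illustrate a five-year result; none of them come from the linked article. The function name `payback_years` and its parameters are hypothetical.

```python
def payback_years(hardware_cost, monthly_api_cost, monthly_running_cost):
    """Years until an upfront hardware purchase pays for itself.

    hardware_cost        -- one-time cost of buying the hardware
    monthly_api_cost     -- what the same workload would cost on a hosted API
    monthly_running_cost -- power, hosting, and upkeep for the owned hardware
    """
    monthly_savings = monthly_api_cost - monthly_running_cost
    if monthly_savings <= 0:
        # Running your own hardware never catches up with the hosted option.
        return float("inf")
    return hardware_cost / monthly_savings / 12


# Illustrative placeholder numbers only (not taken from the article).
years = payback_years(hardware_cost=6000, monthly_api_cost=150, monthly_running_cost=50)
print(f"Payback period: {years:.1f} years")
```

The sketch ignores hardware depreciation, engineering time, and changes in API pricing, all of which would lengthen or shorten the real payback period.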
May 2024
Business
- Economic Impact of Generative AI
- Impact of Generative AI on Work
- AI leads a service-as-software paradigm shift
- Unicorn Club – Enterprise SaaS Leads the Way
- ML, AI and Data (MAD) Landscape
- Every federal agency now needs to have a chief AI officer
- AI Startups Require New Strategies
- Way Enterprises Are Building and Buying Generative AI
Technical
- RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing
- Agent strategies to improve results
- Agent design strategies
- Unusual Prompts
- Offering a tip to LLM improves results
- Prompting structure and setting up the LLM for success
- Generating and using rationales to improve response quality
- Multi Agent Debate
- Meta Prompting
- Meta-Reasoning over Multiple Chains of Thought
- Deceiving to Enlighten: Coaxing LLMs to Self-Reflection for Enhanced Bias Detection and Mitigation
- RAG Complexity and Evaluation
- Great overview of Evals
- Search-Augmented Factuality Evaluator (SAFE)
- Fine grain, atomic evaluation of factual precision in long form text
- Evaluating LLM answers
- Taxonomy of hallucinations
- Divide and conquer evaluations of larger outputs to look for hallucinations / consistency
- Information at the beginning of the context window is lost when larger contexts are passed in
- Lost recall in large context windows
- LLM as Optimizer of Prompts
- Automated optimization of prompts
- Lowering LLM costs