AI Ecosystem Intelligence Explorer
GitHub - smartaces/Anthropic_Claude_Sonnet_3_7_extended_thinking_colab_quickstart_notebook
A Colab quickstart notebook for getting started with Anthropic's Claude 3.7 Sonnet extended thinking mode.
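Below is a minimal sketch of the kind of extended-thinking request such a quickstart notebook would make against the Anthropic API; the model identifier, token budgets, and prompt are placeholder assumptions rather than the notebook's actual contents.

```python
# Minimal extended-thinking request via the Anthropic Python SDK.
# Model id, token budgets, and prompt are assumptions, not from the notebook.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-3-7-sonnet-20250219",   # assumed model identifier
    max_tokens=16000,
    thinking={"type": "enabled", "budget_tokens": 4000},  # reserve tokens for reasoning
    messages=[{"role": "user", "content": "How many primes are there below 1000?"}],
)

# The response interleaves "thinking" blocks (the model's reasoning)
# with ordinary "text" blocks (the final answer).
for block in response.content:
    if block.type == "thinking":
        print("[thinking]", block.thinking)
    elif block.type == "text":
        print("[answer]", block.text)
```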
GitHub - dzhng/deep-research: An AI-powered research assistant that performs iterative, deep research on any topic by combining search engines, web scraping, and large language models. The goal of this repo is to provide the simplest implementation of a deep research agent - e.g. an agent that can refine its research direction over time and deep dive into a topic.
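The loop the repo describes can be sketched as follows. This is an illustrative outline under assumed helper functions (web_search, fetch_page, ask_llm), not the repo's actual implementation, which is written in TypeScript.

```python
# Illustrative iterative deep-research loop: generate queries, search,
# scrape and summarize, derive follow-up questions, repeat to a fixed depth.
# web_search, fetch_page, and ask_llm are hypothetical stand-ins.

def ask_llm(prompt: str) -> str:
    """Placeholder for a call to any chat-completion model."""
    raise NotImplementedError

def web_search(query: str) -> list[str]:
    """Placeholder returning a list of result URLs for a query."""
    raise NotImplementedError

def fetch_page(url: str) -> str:
    """Placeholder returning the scraped text of a page."""
    raise NotImplementedError

def deep_research(topic: str, depth: int = 3, breadth: int = 3) -> str:
    learnings: list[str] = []
    questions = [topic]
    for _ in range(depth):
        next_questions = []
        for question in questions[:breadth]:
            for url in web_search(question)[:breadth]:
                summary = ask_llm(
                    f"Summarize what this page says about {question!r}:\n{fetch_page(url)}"
                )
                learnings.append(summary)
            # Let the model refine its research direction from what it has learned so far.
            next_questions.append(ask_llm(
                "Given these findings, what follow-up question should be researched next?\n"
                + "\n".join(learnings[-breadth:])
            ))
        questions = next_questions
    return ask_llm(f"Write a research report on {topic!r} using these findings:\n"
                   + "\n".join(learnings))
```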
Dust Off That Old Hardware and Run DeepSeek R1 on It
No A100 GPU? No problem! You can use exo to combine old laptops, phones, and Raspberry Pis into an AI powerhouse that runs even DeepSeek R1.
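exo advertises a ChatGPT-compatible API once the nodes on a network have discovered each other, so querying the cluster looks like querying any OpenAI-style endpoint. A rough sketch follows; the port and model identifier are assumptions and may differ by exo version.

```python
# Query a local exo cluster through its ChatGPT-compatible endpoint.
# The address (port 52415) and model id below are assumptions.
import requests

resp = requests.post(
    "http://localhost:52415/v1/chat/completions",  # assumed default exo API address
    json={
        "model": "deepseek-r1",  # placeholder model id
        "messages": [{"role": "user",
                      "content": "Explain speculative decoding in one paragraph."}],
    },
    timeout=600,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```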
Self-Improving Transformers Overcome Easy-to-Hard and Length Generalization Challenges
Large language models often struggle with length generalization and solving complex problem instances beyond their training distribution. We present a self-improvement approach where models iteratively generate and learn from their own solutions, progressively tackling harder problems while maintaining a standard transformer architecture. Across diverse tasks including arithmetic, string manipulation, and maze solving, self-improvement enables models to solve problems far beyond their initial training distribution; for instance, generalizing from 10-digit to 100-digit addition without apparent saturation. We observe that in some cases filtering for correct self-generated examples leads to exponential improvements in out-of-distribution performance across training rounds. Additionally, starting from pretrained models significantly accelerates this self-improvement process for several tasks. Our results demonstrate how controlled weak-to-strong curricula can systematically teach a model logical extrapolation without any changes to the positional embeddings or the model architecture.
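The weak-to-strong curriculum the abstract describes can be sketched as a loop: push the problem difficulty slightly beyond the current training distribution, sample the model's own solutions, keep only the filtered ones, and retrain. The sketch below uses n-digit addition and a majority-vote (self-consistency) filter as one plausible filtering strategy; sample_solution and train_on are hypothetical placeholders, not the paper's code.

```python
# Sketch of a self-improvement curriculum on n-digit addition.
# sample_solution and train_on are hypothetical stand-ins for model
# inference and a fine-tuning step.
import random
from collections import Counter

def sample_solution(model, a: int, b: int) -> int:
    """Placeholder: ask the current model for a + b and parse its answer."""
    raise NotImplementedError

def train_on(model, examples: list[tuple[int, int, int]]) -> None:
    """Placeholder: fine-tune the model on (a, b, predicted_sum) examples."""
    raise NotImplementedError

def self_improve(model, start_digits: int = 10, rounds: int = 20,
                 per_round: int = 1000, samples: int = 8):
    digits = start_digits
    for _ in range(rounds):
        digits += 1                      # push slightly beyond the training distribution
        kept = []
        for _ in range(per_round):
            a = random.randrange(10 ** (digits - 1), 10 ** digits)
            b = random.randrange(10 ** (digits - 1), 10 ** digits)
            answers = [sample_solution(model, a, b) for _ in range(samples)]
            answer, votes = Counter(answers).most_common(1)[0]
            if votes > samples // 2:     # keep only self-consistent answers
                kept.append((a, b, answer))
        train_on(model, kept)            # the next round starts from the improved model
    return model
```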
Understanding Reasoning LLMs
Methods and Strategies for Building and Refining Reasoning Models
Li Fei-Fei’s Team Trains AI Model for Under $50, Revolutionizing Industry Standards
Li Fei-Fei's team has trained a new model, S1, for a cloud computing cost of under $50, prompting a reevaluation of the development costs associated with artificial intelligence. The achievement is notable because S1's performance on mathematical and coding ability tests is comparable to that of top-tier models such as OpenAI's o1 and DeepSeek's R1. The research, conducted by Li Fei-Fei and colleagues from Stanford University and the University of Washington, demonstrates that with careful selection of training data and the application of distillation techniques, it is possible to create highly capable AI models at a fraction of the cost typically associated with such endeavors.
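The distillation step described above amounts to collecting reasoning traces from a stronger "teacher" model over a small, carefully curated question set and fine-tuning an open-weight model on them. The sketch below shows only the dataset-building half; teacher_generate and the file layout are hypothetical placeholders, not the S1 team's pipeline.

```python
# Build a small SFT dataset by distilling reasoning traces from a teacher model.
# teacher_generate and the questions file are hypothetical placeholders.
import json

def teacher_generate(question: str) -> dict:
    """Placeholder: return {'reasoning': ..., 'answer': ...} from a teacher model."""
    raise NotImplementedError

def build_sft_dataset(questions_path: str, out_path: str) -> None:
    with open(questions_path) as f:
        questions = [line.strip() for line in f if line.strip()]
    with open(out_path, "w") as out:
        for q in questions:              # a small curated question set keeps compute cost low
            trace = teacher_generate(q)
            out.write(json.dumps({
                "prompt": q,
                "completion": trace["reasoning"] + "\n\nAnswer: " + trace["answer"],
            }) + "\n")

# The resulting JSONL feeds a short supervised fine-tuning run of an
# open-weight base model, which is where the low compute bill comes from.
```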
Deep Dive into LLMs like ChatGPT
This is a general audience deep dive into the Large Language Model (LLM) AI technology that powers ChatGPT and related products. It covers the full training…
FineWeb: decanting the web for the finest text data at scale - a Hugging Face Space by HuggingFaceFW
A Hugging Face Space by HuggingFaceFW walking through how the FineWeb dataset was built: filtering and deduplicating web-crawl text to produce high-quality training data at scale.
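To give a flavor of the heuristic quality filtering such a pipeline applies, here is a toy document filter. The thresholds are arbitrary illustrative values, not FineWeb's actual filter settings or code.

```python
# Toy heuristic quality filter of the kind applied to web-crawl documents
# before including them in a training corpus. Thresholds are illustrative only.
def passes_quality_filters(doc: str) -> bool:
    lines = [l for l in doc.splitlines() if l.strip()]
    if not lines:
        return False
    words = doc.split()
    if len(words) < 50:                                # drop very short documents
        return False
    mean_word_len = sum(len(w) for w in words) / len(words)
    if not (3 <= mean_word_len <= 10):                 # drop gibberish-like text
        return False
    bullet_lines = sum(l.lstrip().startswith(("-", "*", "•")) for l in lines)
    if bullet_lines / len(lines) > 0.9:                # drop pages that are almost all lists
        return False
    if len(set(lines)) / len(lines) < 0.5:             # drop pages dominated by repeated lines
        return False
    return True

print(passes_quality_filters("lorem ipsum dolor sit amet " * 20))  # True
```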
Importing Phantoms: Measuring LLM Package Hallucination Vulnerabilities
Large Language Models (LLMs) have become an essential tool in the programmer's toolkit, but their tendency to hallucinate code can be used by malicious actors to introduce vulnerabilities into broad swathes of the software supply chain. In this work, we analyze package hallucination behaviour in LLMs across popular programming languages, examining both existing package references and fictional dependencies. From this analysis we identify potential attacks and suggest defensive strategies against them. We discover that the package hallucination rate depends not only on model choice, but also on programming language, model size, and the specificity of the coding task request. The Pareto optimality boundary between code generation performance and package hallucination is sparsely populated, suggesting that coding models are not being optimized for secure code. Additionally, we find an inverse correlation between package hallucination rate and the HumanEval coding benchmark, offering a heuristic for evaluating a model's propensity to hallucinate packages. Our metrics, findings, and analyses provide a base for securing AI-assisted software development workflows against package supply chain attacks.
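The core measurement is simple to reproduce for Python: extract the package names a model's generated code references and check whether they exist on the registry. The sketch below uses PyPI's public JSON API; the regex-based import extraction is a simplification, and note that import names do not always match distribution names (e.g. cv2 vs opencv-python), so a real pipeline needs a mapping.

```python
# Check whether packages referenced in generated code actually exist on PyPI.
# The import-extraction regex is a simplification for illustration.
import re
import requests

def referenced_packages(code: str) -> set[str]:
    """Pull top-level module names out of import statements."""
    pattern = r"^\s*(?:from|import)\s+([A-Za-z_]\w*)"
    return {m.group(1) for m in re.finditer(pattern, code, re.MULTILINE)}

def exists_on_pypi(name: str) -> bool:
    """A 404 from PyPI's JSON endpoint means the package is not registered."""
    resp = requests.get(f"https://pypi.org/pypi/{name}/json", timeout=10)
    return resp.status_code == 200

generated = "import numpy\nimport totally_made_up_pkg\n"
for pkg in sorted(referenced_packages(generated)):
    status = "exists" if exists_on_pypi(pkg) else "hallucinated or unregistered"
    print(f"{pkg}: {status}")
```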