ollama run phi4
MIT License
Bryan Johnson: God didn't create us, we are creating God in our own image by building superintelligent AI
At first glance, I was confused as to how they had managed to get so many celebrities together.
Then I realized it.
Creds to aininja_official on instagram (not on X)
Resisting the urge to say it, but the answer is obvious.
gm
CES report #22:
Grok soon will have voice.
But will it be this good?
Can it handle multiple voices in the car? And talk with each person in real time, or just with the driver?
Lock onto the driver’s voice while there is a screaming baby in the back seat?
Will Grok’s latency/speed be this good?
Can it handle multiple people talking in different languages?
@Speechmatics
is the industry’s best at the moment. They believe they have a two-year lead over Tesla.
We will find out in a few weeks.
This is hugely important for Robotaxi networks and robots.
This paper from DeepMind is blowing my mind:
“Our findings reveal that models fine-tuned on weaker & cheaper generated data consistently outperform those trained on stronger & more-expensive generated data across multiple benchmarks…”
I tried to see how Kling v1.6 would handle the trolley problem.
But it just backed away slowly.
Edward Norton says AI will never write songs of the caliber of Bob Dylan: "you can run AI for a thousand years, it's not going to write Bob Dylan's songs"
On certain real-world clinical diagnosis benchmarks, physicians assisted by GPT-4 perform worse than GPT-4 on its own
https://arxiv.org/abs/2412.10849
you’re either cracked or you’re cooked
LORD JENSEN DRIPPED OUT TECHNOLOGY BROTHER
do you think more paradigms are needed to form agi?
or just scaling these 3?
Project DIGITS could run 4o locally, that's crazy
Jensen is the new Steve Jobs. Meaning he's a genius at marketing and turned the Nvidia brand into a religion
ARE YOU NOT ENTERTAINED??
If you treat coding as a way to interact with the world, then you might believe that coding agents are all you need for 2025.
AGI is coming in 2025. How we feelin' about this?
Releasing METAGENE-1: In collaboration with researchers from USC, we're open-sourcing a state-of-the-art 7B parameter Metagenomic Foundation Model.
Enabling planetary-scale pathogen detection and reducing the risk of pandemics in the age of exponential biology.
Hot take: if you're NOT using AI to at least brainstorm or do a first draft of your work (fiction or nonfiction) you're falling behind.
the stockfish moment has arrived
on some tasks, modern ai isn’t just better than a human, but better than human + ai working together
Here's a 6 word story for claude
Message limit reached for Claude 3.5
For your information, Sam Altman defines Superintelligence as the phase where AI can rapidly increase the rate of scientific discovery
(Via Bloomberg)
OpenAI is also straight shooting for superintelligence now.
superintelligence, in the true sense of the word
insane thing: we are currently losing money on openai pro subscriptions!
people use it much more than we expected.
First look at SPARTA, a distributed AI training algorithm that avoids synchronization by randomly exchanging sparse sets of parameters (<0.1%) asynchronously between GPUs.
Preliminary results and details of our mac mini training run are available now (link below).
SPARTA achieves >1,000x reduction in inter-GPU communication, enabling training of large models over slow bandwidths without specialized infrastructure.
SPARTA works on its own but can also be combined with sync-based low communication training algorithms like DiLoCo for even better performance.
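In toy form, the core trick might look like this (my own sketch from the description above, not EXO's actual code; the <0.1% fraction is from the announcement, everything else here is made up):

```python
import numpy as np

# Toy sketch of SPARTA-style sparse parameter exchange: instead of
# syncing full gradients every step, each worker occasionally averages
# a tiny random subset (<0.1%) of its parameters with a peer's, which
# is where the ~1000x reduction in inter-GPU traffic would come from.
rng = np.random.default_rng(0)
n_params = 1_000_000
worker_a = rng.normal(size=n_params)  # stand-ins for two model replicas
worker_b = rng.normal(size=n_params)

def sparse_exchange(a, b, fraction=0.001):
    k = int(len(a) * fraction)                       # <0.1% of parameters
    idx = rng.choice(len(a), size=k, replace=False)  # random sparse subset
    avg = (a[idx] + b[idx]) / 2                      # average just that slice
    a[idx] = avg
    b[idx] = avg

for step in range(100):
    # ...each worker would take its own local optimizer steps here...
    sparse_exchange(worker_a, worker_b)

print("mean |a-b| after exchanges:", np.abs(worker_a - worker_b).mean())
```

The replicas drift between exchanges; the random slices keep them loosely coupled without ever shipping the full parameter vector.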
Giving an AI agent access to my iPhone.
Your phone knows you better than anyone. What if your AI agent could go through your phone to truly understand you?
The EXO Agent uses iPhone mirroring to look through your apps including YouTube/Netflix watch history, X likes and photos. With every swipe, it learns who you are.
No logins or APIs. Just your phone, mirrored.
Available in preview with the EXO Desktop App (link below).
OpenAI has been actively developing Project "Perhonen" for an extended period.
Ghostwriters, bloggers, and copywriters have been getting executed left and right for at least 18 months already
2025 we attack business processes, accountants, human resources, banking, and corporate paper pushers
2026-2027 the robots start making a massive impact
Figure is already deploying at BMW and China is rolling a bunch out already ...if ppl thought Amazon warehouses were next level, wait till it becomes commoditized and hits "all" warehouses
Also on the same timeline, in any case before 2030, you have all the autonomous vehicles coming to "first" replace all the taxi, public transport, and freight drivers
2030-2035 will be unrecognizable compared to today
... it's amazing and scary simultaneously
Caught a succubus
Right now, the shoggoths are still ~human level at persuasion, but soon they'll have "manipulation superpowers"
2-4 years from now, imagine this:
There are BILLIONS of them roaming the internet...
...talking with MILLIONS of humans simultaneously...
...clicking, programming, writing, and thinking at speeds MILLIONS of times faster than mere humans can operate...
...self improving exponentially...
Do you see how out of control that could get?
Geoffrey Hinton: "They'll be able to manipulate people. They'll be very good at convincing people, because they'll have learned from all the novels ever written, all the books by Machiavelli, all the political connivances..."
What if we could connect all the dark compute across the globe to build the world's biggest AI data center?
Most of the compute in the world is dark: phones, laptops, Teslas, PS5s, TVs. These devices have powerful GPUs but are mostly sitting idle.
Today, EXO Labs is announcing the first step in activating all of this dark compute for AI workloads: evML, a distributed computing protocol that enforces honest behavior through hardware security and spot-checks with just 5% performance overhead. Version 0.0.1 of evML is compatible with Apple, Samsung and LG devices.
Our preliminary report on evML is now available (link below).
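No code is out yet, so here's a hypothetical sketch of what spot-check enforcement could look like (all names here are mine; the only number taken from the announcement is ~5%, which I'm loosely mapping to a 5% re-check rate):

```python
import random

# Hypothetical sketch of spot-check verification for untrusted devices,
# in the spirit of evML (my assumptions, not the actual protocol):
# re-verify a random ~5% of each worker's results on trusted hardware
# and evict any worker caught returning wrong answers.
SPOT_CHECK_RATE = 0.05

def trusted_compute(task):
    return task * task  # stand-in for re-running the job on trusted hardware

def verify_worker(results):
    """results: list of (task, claimed_output) pairs from one device."""
    for task, claimed in results:
        if random.random() < SPOT_CHECK_RATE:
            if trusted_compute(task) != claimed:
                return False  # caught lying on a spot check -> evict
    return True

honest = [(t, t * t) for t in range(1000)]
cheater = [(t, 0) for t in range(1000)]
print(verify_worker(honest))   # True
print(verify_worker(cheater))  # almost certainly False
```

The economics only work if getting caught costs more than honest computing; that's presumably where the hardware-security side of the protocol comes in.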
The Transformer paper, "Attention Is All You Need" from 2017, used roughly 1e19 FLOPs.
Today most labs have a tonne more, more than a million times more compute (1e19 × 1e6 = 1e25 FLOPs).
You would expect that, with the right focus and people, we should have had at least similar-level breakthroughs in architectures by now.
32-year-old
@ylecun
shows off the world's first Convolutional Network for Text Recognition, back in 1993.
is that "agentic" enough
2024 was the year of the sweet lesson, where a colossal load of money was burned to make hilariously bad models, and the 2 nerd companies with clever, creative approaches killed it
This is the most gut-wrenching blog I've read, because it's so real and so close to heart. The author is no longer with us. I'm in tears. AI is not supposed to be 200B weights of stress and pain. It used to be a place of coffee-infused eureka moments, of exciting late-night arxiv safaris, of wicked smart ideas that put a smile on our faces. But all the incoming capital and attention seem to be forcing everyone to race to the bottom.
Jensen always tells us not to use phrases like "beat this, crush that". I absolutely love this perspective. We are here to lift up an entire ecosystem, not to send anyone to oblivion. I like to think of my work as expanding the pie. We need to bake the pie first, together, the bigger the better, before dividing it. It gives me comfort knowing that our team's work moved the needle for robotics, even just by a tiny bit.
AI is not a zero sum game. In fact, it is perhaps the most positive-sum game that humanity ever plays. And we as a community should act this way. Take care of each other. Send love to "competitors" - because in the grand schemes of things, we are all coauthors of an accelerated future.
I never had the privilege to know Felix irl, but I loved his research taste and set up a Google Scholar alert for every one of his new papers. His work on agents and VLMs had a big influence on mine. He would've been a great friend. I wanted to get to know him, but now I never can.
RIP Felix. May the next world have no wars to fight.
When I watched Her, it really bothered me that they had extremely advanced AI and society didn't seem to care. What I thought was a plot hole turns out to be spot on
Elon buddying up with DeepMind because he hates sama that much is a wild twist of fate
1-800-ChatGPT might seem like a silly gimmick, but the underlying principle is critical to scaling AI adoption.
The next billion AI users will not be on the existing UXs; they will be using text, email, and voice.
Whoever lands this experience is going to win in a huge way.
gemini is really underrated btw.
AI Agent frameworks in a nutshell
ChatGPT is talking to 100,000+ people this moment
Samantha from Her was talking to 8,316
The real power of DeepSeek V3 API direct calls.
1000 parallel calls for 1000 tests in 7 seconds!
No speed-up in the video! Normal speed!
Yesterday I tried through OpenRouter and I was able to do 200 parallel calls for 500 tests in 10 seconds
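If you want to try reproducing this, a minimal asyncio sketch (endpoint and model name per DeepSeek's OpenAI-compatible API as I recall it; verify against their docs before relying on it):

```python
import asyncio
import os

import httpx

# Rough sketch of firing many DeepSeek V3 calls in parallel.
# URL and model name are my best recollection of DeepSeek's
# OpenAI-compatible API; double-check both.
URL = "https://api.deepseek.com/chat/completions"
HEADERS = {"Authorization": f"Bearer {os.environ['DEEPSEEK_API_KEY']}"}

async def one_call(client: httpx.AsyncClient, prompt: str) -> str:
    resp = await client.post(URL, headers=HEADERS, json={
        "model": "deepseek-chat",
        "messages": [{"role": "user", "content": prompt}],
    })
    return resp.json()["choices"][0]["message"]["content"]

async def main(n: int = 1000):
    # Raise the connection limit so all n requests really run in parallel.
    limits = httpx.Limits(max_connections=n)
    async with httpx.AsyncClient(timeout=60, limits=limits) as client:
        prompts = [f"Test #{i}: reply with OK" for i in range(n)]
        return await asyncio.gather(*(one_call(client, p) for p in prompts))

results = asyncio.run(main())
print(len(results), "responses")
```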
Seems there are two different philosophies at work here:
Kling is creating a "view"
- straightforward, simulacra
Runway is creating a "vision"
- cinematic, aesthetic
The picture (Kling) and the picture of the picture (Runway)
I would choose Kling for an immersive product
It's done. I officially recognize that we have a model that beat this benchmark.
@PrimeIntellect
and
@exolabs
are doing the most important work for humans, no joke.
@GroqInc
deserves an honorable mention, as always.
o1-preview shows a whopping 30% reduction in accuracy when Putnam math problems are slightly varied.
Not sure about the reliability of the results but I think it's a good paper worth checking out to understand LLM robustness on complex math problems better.
A hot research topic right now is to understand better whether these models can "reason" robustly or if they are just relying on memorization.
Robustness is important as it's key to model reliability.
AI will punish those who aren't concise.
The Llama run on Windows98 reached escape velocity out of tpot
What device should we do next? GameBoy? Xbox 360? PS2?
What could you do with 55 Mac Minis?
We have Nokia Intelligence before GTA VI
DeepSeek V3 is now live in the Arena
Congrats
@deepseek_ai
on the impressive release, matching top proprietary models like Claude Sonnet/GPT-4o across standard benchmarks.
Now it's time for the human test - come challenge it with your toughest prompts at lmarena and stay tuned for leaderboard release!
2024 wrapped
- 34 VC rejections
- a few investors said yes including
@naval
- 1 co-founder quit
- pivoted 3 times
- launched repo, hit #1 trending in the world and 18k stars
- backdoor attempt on repo through innocent-looking pr (north korea?)
- approached twice for acquisition
- twitter hacked
- world's first run of llama 405B and deepseek v3 671B on macs
- featured by meta at meta connect keynote
- at least 10 crypto scams using exo's name
- 569 job applicants, only hired 1 so far
Write your own private AI benchmarks. Don't outsource your model / AGI evaluation to the crowd.
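In practice that can be a tiny script only you ever see. A minimal sketch (call_model is a placeholder for whatever client you use; the cases are made-up examples):

```python
# Minimal private benchmark harness: your own prompts, your own grading,
# never published, so models can't be tuned to it.
CASES = [
    ("What is 17 * 24?",               lambda out: "408" in out),
    ("Write a haiku about compilers.", lambda out: len(out.splitlines()) == 3),
]

def run_private_benchmark(call_model):
    """call_model: a function mapping a prompt string to a model reply."""
    passed = sum(check(call_model(prompt)) for prompt, check in CASES)
    print(f"{passed}/{len(CASES)} passed")

# usage: run_private_benchmark(lambda p: my_client.complete(p))
```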
PSA: our experimental Gemini models are free (in Google AI Studio and API), 10 RPM, 4M TPM, 1500 RPD (requests/min, tokens/min, requests/day).
Enjoy the most powerful models we have to offer (2.0 flash, thinking, 1206, etc), with just 3 clicks on:
It's shocking how Mistral was THE open-weight alternative to greedy OpenAI and Anthropic last year, and they lost it. I think this will eventually be taught in business schools as a case study in how to miss your unique chance when you hold all the cards.
"I am gonna build a vertical AI Agent startup in 2025"
Why is the conclusion an “AI bust”?
First versions don’t always work.
Innovator’s dilemma is real.
And LLMs are getting better…for cheaper.
Sounds awesome.
Controlling my AI assistant from the indestructible Nokia 3310
Sends SMS directly to AI home cluster with access to all my files, emails, calendar and other apps.
Try it yourself, link below.
Being a founder is like getting punched in the face 99 times and still hoping the 100th time is a kiss instead
I've also used 150m tokens of DeepSeek in the last few days... for $15. With no rate limits, I am routinely running 50+ threads against it.
Python is now the No. 1 language on GitHub surpassing JavaScript.
I should finally buy a 4090
all chatbots are retarded on edge cases and this has barely changed in the last two years
In the last 24 hrs, have gotten calls from two different founders in desperate need of 5x more reliable inference compute than they budgeted for this month. The inference supply crunch for code models is real. Absolutely bonkers.
gm. 2025. let’s cook
ARE YOU SMARTER THAN THE AVG CS MAJOR?
Introducing Leetle. (leetle DOT app)
Leetcode meets Wordle.
6 test cases. 6 attempts. + it’s timed.
Daily Python Challenges. Forever.
Optimized for Mobile Coding.
Your first challenge? FizzBuzz.
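For reference, the classic answer (plain FizzBuzz; Leetle's actual test cases may differ):

```python
# Classic FizzBuzz: multiples of 3 -> "Fizz", of 5 -> "Buzz", of both -> "FizzBuzz".
def fizzbuzz(n: int) -> str:
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)

for i in range(1, 16):
    print(fizzbuzz(i))
```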
At the beginning of 2024, uv didn’t even exist. Feels like a lifetime ago.
I need to write a blog post about the open-source journey of Qwen in 2024.
Yes, Claude's font is a war crime
Before LLMs, one of my biggest frustrations was not getting answers to 'why' questions. Even in college, I remember asking why I need to make files executable and getting just stares or condescending 'so you can execute them' replies. LLMs make it so much easier to get answers to 'why' questions
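And to actually answer that old "why": the execute bit is just permission metadata the kernel checks before it will exec a file; the file's contents don't change at all. A quick illustration in Python:

```python
import os
import stat

# Why "make it executable"? Because the kernel refuses to exec a file
# until its execute permission bit is set. Flipping the bit changes
# metadata only, not the bytes of the file.
path = "hello.sh"
with open(path, "w") as f:
    f.write("#!/bin/sh\necho hello\n")

mode = os.stat(path).st_mode
os.chmod(path, mode | stat.S_IXUSR)  # equivalent to: chmod u+x hello.sh
os.system(f"./{path}")               # now the kernel agrees to run it
```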
what if hallucinations are actually a feature, not a bug?
Wait this is fucking insane — Claude immediately guessed I was French.
How can anyone still think these things are stochastic parrots and not reasoning? Do they really think there is much "people guessing other people's native languages" in the training data?
Self-improving shoggoths
Ex-OpenAI researcher Daniel Kokotajlo expects that in the next few years AIs will take over from human AI researchers, improving AI faster than humans could
You get how insane the implications of that are, right?
WTAF
@GoogleAI
.....
how is...
what the...
but then what happ....
If you're a guy in your 20s, build a chatgpt wrapper.
Go into debt if you have to.
friend is a medical illustrator
few years ago, we were wondering when AI could do her job — getting the details right is important in this use case
since then, she's been noticing more and more news articles and such using AI generated medical images
Note the 3rd forearm bone
One of the best things to happen in 2024 was Google and DeepSeek becoming competitive.
Imagen 3 is prob my fav image model now, and despite having unlimited Sora now I barely use it due to its meh quality.
2025 is a definitive year, no more excuses.
Ok back to my pudding now
i'm a little late to the party here but just read about the NeurIPS best paper drama today. you're telling me that ONE intern
> manually modified model weights to make colleagues' models fail
> hacked machines to make them crash naturally during large training runs
> made tiny, innocuous edits to certain files to sabotage model pipelines
> did this all so that he could use more GPUs
> used the extra GPUs to do good research
> his research WON THE BEST PAPER AWARD
> now bytedance is suing this guy for 1 million dollars?!
> sounds to me like he is a genius
> maybe they should hire him full-time instead
A lot of people don't know the current wave of large models can effectively be traced back to one dude. At the time OpenAI was mostly doing RL. All evidence was that scaling models via parameter counts was pointless. The consensus view in the field BY FAR was that gradient descent was unlikely to lead to anything crazy on its own and that a new paradigm was likely required. The order of events I've heard was that Alec Radford basically went off and did the first GPT paper by himself in a corner at OpenAI. I think that work was as important as the attention paper. This is also why the whole 'google invented attention why didn't they capitalize' argument is very wrong - almost no one in the field felt that scaling those methods would work.
“EXO, get me in the mood for the holidays”
A home assistant you can trust. It’s 100% Open-Source and runs locally.
AI that’s aligned with you, not a for-profit company.
The biggest 10-year investment you can make in AI is learning how it all actually works in the next 6 months.
Use Claude every day.
Download Cursor and program with it.
Experiment with open-source LLMs.
Deploy your own AI agent.
I guarantee 90% of you will look back on this period and wish you had invested your time into learning and exploring AI.
This is the equivalent of getting into web2 in 2010, or the internet in 1993.
Once-in-a-generation opportunity, and this tweet is to make sure you don't fumble it.
Reflecting back, these were the biggest technical lessons for me in AI in the past five years:
2020: you can cast any language task as sequence prediction and learn it via pretrain + finetune
2021: scaling to GPT-3 size enables doing arbitrary tasks specified via instructions
2022: scaling to GPT-3.5/PaLM size unlocks reasoning via chain of thought
2023: LLMs themselves can be a product a lot of people will use
2024: to push capabilities past GPT-4, scale test-time compute
2025: ??? probably something like “agents work”
I feel like each one was a significant shift in what I should put effort on. I am super curious to hear other people's lists of things they learned each year that changed the way they view the field, and how those lists differ.
Looking back, these things seem obvious in hindsight but were hard to know ahead of time. I definitely learned some things slower than others and it took me time to understand things deeply. For example, some people probably deeply understood the lessons from GPT-3 in 2020 or earlier, or realized the importance of scaling test-time compute way before 2024.
Exciting News from Chatbot Arena
@OpenAI
's o1 rises to joint #1 (+24 points from o1-preview) and
@deepseek_ai
DeepSeek-V3 secures #7, now the best and the only open model in the top-10!
o1 Highlights:
- Achieves the highest score under style control
- #1 across all domains except creative writing
DeepSeek-V3:
- The most cost-effective model in the top-10 ($0.14/1M input tokens)
- Robust performance under style control
- Strong in hard prompts and coding
Congrats to both on the incredible achievements! Looking forward to seeing more breakthroughs in 2025. More analysis in the thread
just realised Claude has only 956 ratings on iOS app store
meanwhile ChatGPT has 102k, it's not even close dawg
He’s a 10 but he always uses a uuid for his database primary key
you can just measure things
Sam Altman pushed out o3 overfitted on benchmarks and he's selling it as AGI
Aider has been quietly delivering what Devin just promised - with far less hype and a $500/month lower price tag
With 14 months of hindsight this seems very plausible to me right now
spotted at 38c3
they killed him man
Just increased my Deepseek V3 requests to 100 parallel threads. Can't believe it works
Straight shot to ASI is looking more and more probable by the month… this is what Ilya saw
Ilya founded SSI with the plan to do a straight shot to Artificial Super Intelligence. No intermediate products, no intermediate model releases.
Many people (me included) saw this as unlikely to work since if you get the flywheel spinning on models / products, you can build a real moat.
However, the success of scaling test time compute (which Ilya likely saw early signs of) is a good indication that this direct path to just continuing to scale up might actually work.
We are still going to get AGI, but unlike the consensus from 4 years ago that it would be this inflection-point moment in history, it's likely going to just look a lot like a product release, with many iterations and similar options in the market within a short period of time (which fwiw is likely the best outcome for humanity, so I'm personally happy about this).
common themes:
AGI
agents
much better 4o upgrade
much better memory
longer context
“grown up mode”
deep research feature
better sora
more personalization
(interestingly, many great updates we have coming were mentioned not at all or very little!)
GROWN UP MODE IS HAPPENING LFG
do you know the illustrations published on Ilya's Instagram (a long time ago)?
I agree with
@ylecun
The idea of AI "wanting" to take over comes from misunderstanding what AI is and how it works. The drive to dominate is something that evolved in biological creatures like us because it helped with survival and reproduction in social groups. Machines don’t have that kind of evolutionary baggage.
In humans and other social species, dominance emerged as an adaptive strategy to secure resources, mates, and influence within a group. It’s tied to survival pressures that shaped our evolutionary trajectory. Machines, on the other hand, have no evolutionary lineage, no survival instincts, and no need to "compete" for resources unless explicitly designed to emulate those behaviors.
Humans have things like ambition, greed, or fear because we’re wired that way by evolution. AI doesn’t have hormones or a brain that makes it "feel" anything. It doesn’t wake up one day and decide to take over the world.
AI isn’t going to start acting like a power hungry human unless we somehow make it that way. The fear of AI wanting to take over says more about us projecting our own instincts onto machines than about how AI actually works.