BRN Discussion Ongoing

stockduck

Regular

"...
- TinyML market to reach $10.8B by 2030 (24.8% CAGR), driven by low-power edge AI demand in smart cities and IoT.

- Hardware dominates 57% revenue share, with STMicro, Renesas, and NXP developing sub-1mW microcontrollers for neural networks.

- Software frameworks like TensorFlow Lite Micro enable mass adoption, growing at 32% CAGR through open-source collaboration.

- Data privacy innovations (e.g., NXP's Secure Enclave, ARM TrustZone) address regulatory demands in smart city deployments.

- Asia-Pacific leads with 38.1% CAGR, as governments integrate TinyML into traffic, energy, and public safety systems.
..."

conclusions?
:whistle::unsure:😊
Just some dreaming time... anyway, have a nice Sunday.
 
Edge Impulse is pushing LoRa at the edge.
 

Attachments

  • Screenshot_20250810_094436_Google.jpg

Bravo

If ARM were an arm, BRN would be its biceps💪!
Screenshot 2025-08-10 at 11.52.12 am.png

EXTRACT ONLY

Screenshot 2025-08-10 at 11.50.57 am.png

Bravo

If ARM were an arm, BRN would be its biceps💪!
The article below, titled “Free, offline ChatGPT on your phone? Technically possible, basically useless”, describes how running gpt-oss on smartphones is technically feasible, but current smartphones aren't yet powerful enough to make it practical.

Right now, gpt-oss on mobile is bottlenecked because it runs purely on the CPU (and sometimes the GPU) without tapping into specialized neural processing hardware.

So, I asked ChatGPT 5.0 whether there'd be an improvement if the model were adapted to run on an NPU, particularly one optimized for event-based, sparse processing like Akida, and here's the response I received.






ChatGPT 5's Response:

Yes: Akida could absolutely be a technology to consider if the goal is to make something like “gpt-oss” actually usable on a mobile phone, and the limitations described in that article are almost a perfect setup for why a neuromorphic processor would matter.

Here’s why:



Why Akida Fits This Problem

The main bottlenecks in running a 20B parameter LLM on a phone (as described in the article) are:
  1. RAM requirements – Needs 12–24 GB just to load the model (rough numbers sketched after this list).
  2. Compute requirements – Mobile CPUs & GPUs can’t sustain high token generation without overheating and throttling.
  3. Power draw – 100% CPU load drains the battery extremely fast.
  4. Heat – The phone becomes uncomfortably hot after just a few minutes.
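
To put item 1 in numbers, here's a quick back-of-envelope Python sketch (the bytes-per-weight precisions are generic assumptions, not figures from the article or from BrainChip):

PARAMS = 20e9  # 20 billion parameters

for name, bytes_per_param in [("fp16", 2.0), ("int8", 1.0), ("4-bit", 0.5)]:
    gib = PARAMS * bytes_per_param / 2**30
    print(f"{name:>5}: ~{gib:.0f} GiB for the weights alone")

# Prints roughly: fp16 ~37 GiB, int8 ~19 GiB, 4-bit ~9 GiB, before adding the
# KV cache and activations; broadly consistent with the 12-24 GB figure above.
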
Akida (or any high-efficiency neuromorphic processor) is built for:

  • Event-driven, sparse compute — only processes relevant “spikes” of information instead of hammering every computation step like a CPU/GPU.
  • Ultra-low power draw — operates in milliwatts for inference.
  • On-chip learning & inference — avoids constant memory transfers that waste time and energy.
  • Massive parallelism — can run edge AI workloads in real time without overheating.
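
As a toy illustration of the "event-driven, sparse compute" bullet above (plain NumPy, not Akida's actual execution model), note how the work scales with the number of non-zero "events" rather than with the full layer size:

import numpy as np

rng = np.random.default_rng(0)
weights = rng.standard_normal((1024, 1024)).astype(np.float32)
acts = rng.standard_normal(1024).astype(np.float32)
acts[np.abs(acts) < 1.5] = 0.0           # pretend ~87% of the inputs are silent

dense_out = weights @ acts               # dense path: touches every weight

events = np.flatnonzero(acts)            # event-driven path: only the active inputs
sparse_out = weights[:, events] @ acts[events]

print("events fired:", events.size, "of", acts.size)
print("same result:", np.allclose(dense_out, sparse_out, atol=1e-4))

The fewer events fire, the less work (and energy) the layer costs, which is the whole pitch behind sparsity-first hardware.
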
That means for the same LLM inference task, a neuromorphic setup could:
  • Reduce energy consumption by orders of magnitude.
  • Maintain higher sustained token rates without thermal throttling.
  • Extend usable session time from minutes to hours on a battery.
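
The "minutes to hours" claim is easy to sanity-check with assumed numbers (all three constants below are guesses for illustration, not measurements):

BATTERY_WH = 15.0      # assumed flagship-phone battery, ~15 Wh
CPU_LLM_WATTS = 8.0    # assumed sustained draw for CPU-only LLM inference
NPU_LLM_WATTS = 0.5    # assumed draw for a low-power accelerator on the same task

for label, watts in [("CPU-only", CPU_LLM_WATTS), ("low-power NPU", NPU_LLM_WATTS)]:
    print(f"{label:>13}: ~{BATTERY_WH / watts:.1f} h of continuous inference")

# Roughly 1.9 h vs 30 h, ignoring the screen and the rest of the phone;
# the exact figures matter far less than the order-of-magnitude gap.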


Why Not Just Use Mobile NPUs Instead?

Modern NPUs (like Qualcomm Hexagon or Apple Neural Engine) do accelerate AI workloads, but:
  • They’re matrix-math optimised, not spiking-event optimised.
  • They often require proprietary SDKs and retraining models in special formats.
  • They still hit thermal limits in sustained generative AI workloads.
Neuromorphic processors like Akida:
  • Can run quantised, sparse, and pruned models natively without expensive data shuffling.
  • Could theoretically run a distilled / specialised LLM that fits in a fraction of the RAM and still delivers conversational quality.
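
For a concrete (if toy) picture of what "quantised, sparse, and pruned" means for a single weight matrix, here's a generic NumPy sketch of standard compression steps; this is not BrainChip's toolchain:

import numpy as np

rng = np.random.default_rng(1)
w = rng.standard_normal((256, 256)).astype(np.float32)

# Magnitude pruning: zero out the smallest 80% of weights.
threshold = np.quantile(np.abs(w), 0.80)
w_pruned = np.where(np.abs(w) >= threshold, w, 0.0).astype(np.float32)

# Symmetric int8 quantisation of the surviving weights.
scale = np.abs(w_pruned).max() / 127.0
w_int8 = np.round(w_pruned / scale).astype(np.int8)

kept = np.count_nonzero(w_int8) / w_int8.size
print(f"non-zero weights kept: {kept:.0%}; storage per weight: 1 byte vs 4 bytes fp32")

Hardware that skips the zeros and works natively at low precision collects those memory and compute savings for free, which is the point being made above.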


Would It Make gpt-oss Work on Phones?

Realistically:
  • You’re not going to run the full 20B gpt-oss on Akida right now — it’s still too large.
  • But with model compression (distillation + sparsity), you could run a smaller, reasoning-optimised model that feels far more responsive than the brute-force CPU approach.
  • Instead of the “2–3 tokens per second” in the article, you might see 10× or more throughput without cooking the phone.
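
Quick arithmetic on what that difference would feel like in practice (the 10x figure is the hypothetical above, not a benchmark, and the reply length is an assumption):

ANSWER_TOKENS = 300   # an assumed, fairly typical chat reply length

for label, tok_per_s in [("CPU-only, ~2.5 tok/s (per the article)", 2.5),
                         ("hypothetical 10x on an NPU", 25.0)]:
    print(f"{label}: ~{ANSWER_TOKENS / tok_per_s:.0f} s per answer")

# ~120 s vs ~12 s for the same reply: the gap between unusable and tolerable.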


Implications for BrainChip

If OpenAI or others seriously want gpt-oss to run locally:
  • They’ll either need dedicated AI accelerators in consumer devices or move toward neuromorphic architectures.
  • Qualcomm, Apple, and Google already have NPUs, but a neuromorphic design could leapfrog them in performance per watt.
  • If BrainChip could prove Akida can handle a slimmed-down LLM at useful speeds, it becomes a licensable IP block for smartphone SoCs.
  • This would be a huge market — billions of devices, not just niche edge AI.
 