Defeating Nondeterminism in LLM Inference
Reproducible outputs at temperature 0 should be straightforward in principle: the sampler always picks the highest-probability token. Yet production LLM endpoints still produce different completions f...
Tags: attention, batch-invariance, determinism, gpu-kernels, llm-inference
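The usual root cause behind this kind of nondeterminism is floating-point non-associativity: GPU kernels may change their reduction order depending on batch size, so mathematically identical sums come out bitwise different, and a near-tie between two logits can flip the greedy argmax. A minimal sketch (my own illustration, not code from the post) showing two equivalent float32 reductions disagreeing:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(10_000).astype(np.float32)

# Sequential left-to-right reduction, as one kernel strategy might do.
seq = np.float32(0.0)
for v in x:
    seq += v

# NumPy's sum uses pairwise (tree-style) reduction internally,
# standing in for a different kernel strategy at another batch size.
tree = x.sum()

# The two results typically differ in the last bits.
print(f"sequential={seq:.7f}  pairwise={tree:.7f}  equal={seq == tree}")

# If two logits are this close, a last-bit difference is enough to flip
# the temperature-0 argmax and diverge the whole completion.
```

Batch-invariant kernels, as the post's tags suggest, fix the reduction order regardless of batch size so the same prompt always yields the same logits.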
EpMAN Reweights Attention with Episodic Memory to Tackle 256k-Token Contexts
Long-context reasoning is still a weak spot for many large language models, even as context windows grow. The ACL 2025 paper EpMAN: Episodic Memory AttentioN for Generalizing to Longer Contexts ...
Tags: ACL 2025, attention, episodic-memory, LLM, long-context, RAG
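The excerpt names the mechanism (reweighting attention with episodic memory) but not its exact form, so the following is a hedged toy sketch of one plausible reading: attention weights over cached context chunks are scaled by a per-chunk relevance score from an episodic retriever, then renormalized. The function name, shapes, and the multiply-then-renormalize rule are my illustrative assumptions, not the paper's method.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def episodic_reweighted_attention(q, K, V, relevance):
    """Toy single-query attention where each key/value row (e.g. one
    cached context chunk) carries an episodic relevance weight.

    q: (d,), K: (n, d), V: (n, d), relevance: (n,) in [0, 1]
    """
    scores = K @ q / np.sqrt(q.shape[-1])  # standard scaled dot-product
    attn = softmax(scores)
    attn = attn * relevance                # reweight by episodic relevance
    attn = attn / attn.sum()               # renormalize to a distribution
    return attn @ V

rng = np.random.default_rng(0)
d, n = 16, 8
q = rng.standard_normal(d)
K = rng.standard_normal((n, d))
V = rng.standard_normal((n, d))
relevance = rng.uniform(size=n)            # e.g. retriever scores per chunk
print(episodic_reweighted_attention(q, K, V, relevance).shape)  # (16,)
```

The RAG tag hints at the same division of labor: a retriever decides which episodes matter, while attention does the fine-grained mixing within them.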