Understanding the Go Runtime: The Scheduler (internals-for-interns.com)
115 points by valyala 3 days ago | 5 comments
avabuildsdata 22 minutes ago
The unfair scheduling point resonates. I run a lot of concurrent HTTP workloads in Go (scraping, data pipelines) and the scheduler is honestly fine for throughput-oriented work where you don't care about tail latency. But the moment you need consistent response times under load it becomes a real problem. GOMAXPROCS tuning and runtime.LockOSThread help in narrow cases but they're band-aids. The lack of priority or fairness knobs is a deliberate design choice but it does push certain workloads toward other runtimes.
withinboredom 2 hours ago
My biggest issue with go is its incredibly unfair scheduler. No matter what load you have, P99 and especially P99.9 latency will be higher than any other language. The way that it steals work guarantees that requests “in the middle” will be served last.

It’s a problem that only go can solve, but that means giving up some speed on requests that are currently handled immediately but shouldn’t be. So overall latency will go up and P99 will drop precipitously. Thus, they’ll probably never fix it.

If you have a system that requires predictable latency, go is not the right language for it.

melodyogonna 1 hour ago
> If you have a system that requires predictable latency, go is not the right language for it.

Having a garbage collector already makes this the case; it is a known trade-off.

gf000 1 hour ago
This may have been practically true for a long time, but as Java's ZGC garbage collector proves, this is not a hard truth.

You can have stop-the-world pauses that are independent of heap size, and thus predictable latency (of course, trading off some throughput, but that is almost fundamental).

pjmlp 2 hours ago
It misses having a custom scheduler option, like the Java and .NET runtimes offer. Unfortunately, that is too many knobs for the usual Go approach to language design.

Having an interface for how it is supposed to behave, a runtime.SetScheduler() or something, would be enough, but it won't happen.

red_admiral 2 hours ago
> If you have a system that requires predictable latency, go is not the right language for it.

I presume that's by design, to trade off against other things google designed it for?

withinboredom 2 hours ago
No clue. All I know is that people complain about it every time they benchmark.
desdenova 49 minutes ago
> If you have a system, go is not the right language for it.

FTFY

Horos 2 hours ago
Isn't a dedicated worker pool with priority queues enough to get predictable P99 without leaving Go?

If you fix N workers and control dispatch order yourself, the scheduler barely gets involved — no stealing, no surprises.

The inter-goroutine handoff is ~50-100ns anyway.

Isn't the real issue using `go f()` per request rather than something in the language itself?
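
A rough sketch of the fixed-worker, priority-dispatch idea described above, using only the standard library. All names here (`job`, `jobHeap`, `Pool`, `drainOrder`) are hypothetical, lower `prio` values are assumed to run first, and the queue is left unbounded for brevity, so this shows dispatch order only and says nothing about behaviour under saturation.

```go
package main

import (
	"container/heap"
	"fmt"
	"sync"
)

// job is a hypothetical unit of work; lower prio values run first.
type job struct {
	prio int
	run  func()
}

// jobHeap implements heap.Interface as a min-heap ordered by prio.
type jobHeap []job

func (h jobHeap) Len() int           { return len(h) }
func (h jobHeap) Less(i, j int) bool { return h[i].prio < h[j].prio }
func (h jobHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *jobHeap) Push(x any)        { *h = append(*h, x.(job)) }
func (h *jobHeap) Pop() any {
	old := *h
	j := old[len(old)-1]
	*h = old[:len(old)-1]
	return j
}

// Pool is a fixed set of workers fed from a single priority queue.
type Pool struct {
	mu     sync.Mutex
	cond   *sync.Cond
	jobs   jobHeap
	closed bool
	wg     sync.WaitGroup
}

func NewPool() *Pool {
	p := &Pool{}
	p.cond = sync.NewCond(&p.mu)
	return p
}

// Start launches exactly n long-lived workers; no goroutine per request.
func (p *Pool) Start(n int) {
	for i := 0; i < n; i++ {
		p.wg.Add(1)
		go p.worker()
	}
}

func (p *Pool) worker() {
	defer p.wg.Done()
	for {
		p.mu.Lock()
		for len(p.jobs) == 0 && !p.closed {
			p.cond.Wait()
		}
		if len(p.jobs) == 0 { // closed and fully drained
			p.mu.Unlock()
			return
		}
		j := heap.Pop(&p.jobs).(job) // always the most urgent pending job
		p.mu.Unlock()
		j.run()
	}
}

// Submit enqueues a job; the pool, not arrival order, decides what runs next.
func (p *Pool) Submit(prio int, run func()) {
	p.mu.Lock()
	heap.Push(&p.jobs, job{prio: prio, run: run})
	p.mu.Unlock()
	p.cond.Signal()
}

// Close lets workers drain the remaining jobs and waits for them to exit.
func (p *Pool) Close() {
	p.mu.Lock()
	p.closed = true
	p.mu.Unlock()
	p.cond.Broadcast()
	p.wg.Wait()
}

// drainOrder queues all jobs first, then runs them on a single worker,
// so the returned slice shows the dispatch order deterministically.
func drainOrder(prios []int) []int {
	p := NewPool()
	var order []int
	for _, prio := range prios {
		prio := prio
		p.Submit(prio, func() { order = append(order, prio) })
	}
	p.Start(1)
	p.Close()
	return order
}

func main() {
	fmt.Println(drainOrder([]int{3, 1, 2})) // prints [1 2 3]
}
```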

withinboredom 40 minutes ago
No. Eventually the queues get full and goroutines pause waiting to place the element onto the queue, landing you right back at unfair scheduling.

https://github.com/php/frankenphp/pull/2016 if you want to see a “correctly behaving” implementation that becomes 100% cpu usage under contention.

Horos 21 minutes ago
fair point on blocking sends — but that's an implementation detail, not a structural one.

From my pov, the worker pool's job isn't to absorb saturation. it's to make capacity explicit so the layer above can route around it. a bounded queue that returns ErrQueueFull immediately is a signal, not a failure — it tells the load balancer to try another instance.

saturation on a single instance isn't a scheduler problem, it's a provisioning signal. the fix is horizontal, not vertical. once you're running N instances behind something that understands queue depth, the "unfair scheduler under contention" scenario stops being reachable in production — by design, not by luck.

the FrankenPHP case looks like a single-instance stress test pushed to the limit, which is a valid benchmark but not how you'd architect for HA.
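
The "bounded queue that returns ErrQueueFull immediately" mentioned above could be sketched as a non-blocking send on a buffered channel. `BoundedQueue`, `NewBoundedQueue`, and `TrySubmit` are hypothetical names, and `ErrQueueFull` is taken from the comment rather than from any library.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrQueueFull is the name used in the comment; it is not a standard-library error.
var ErrQueueFull = errors.New("queue full")

// BoundedQueue is a hypothetical fixed-capacity queue backed by a buffered channel.
type BoundedQueue struct {
	ch chan func()
}

func NewBoundedQueue(capacity int) *BoundedQueue {
	return &BoundedQueue{ch: make(chan func(), capacity)}
}

// TrySubmit enqueues task if there is room and never blocks the caller.
// When the buffer is full, the default case fires and the caller gets
// ErrQueueFull instead of parking on the channel send.
func (q *BoundedQueue) TrySubmit(task func()) error {
	select {
	case q.ch <- task:
		return nil
	default:
		return ErrQueueFull
	}
}

func main() {
	q := NewBoundedQueue(2) // capacity 2, with no workers draining it
	for i := 0; i < 3; i++ {
		fmt.Println(i, q.TrySubmit(func() {}))
	}
	// prints:
	// 0 <nil>
	// 1 <nil>
	// 2 queue full
}
```

A caller might translate the error into a rejection such as HTTP 503 so that the layer above retries elsewhere; that mapping is an assumption here, not something the thread specifies.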

pss314 9 hours ago
I enjoyed both these GopherCon talks:

GopherCon 2018: The Scheduler Saga - Kavya Joshi https://www.youtube.com/watch?v=YHRO5WQGh0k

GopherCon 2017: Understanding Channels - Kavya Joshi https://www.youtube.com/watch?v=KBZlN0izeiY

c0balt 9 hours ago
https://m.youtube.com/watch?v=-K11rY57K7k - Dmitry Vyukov — Go scheduler: Implementing language with lightweight concurrency

This one notably also explains the design considerations for golang's M:N:P in comparison to other schemes, and which specific challenges it tries to address.

jvillegasd 9 hours ago
Good videos, thanks for sharing!
GeertVL 2 hours ago
This is an excellent idea as a blog. Kudos!