Hacker News

How exactly is the Go GC easier to "triage" than that of the JVM?

Keep in mind that the HotSpot JVM does escape analysis just as Go does (this is sadly a very common misconception). The difference is that Go relies on escape analysis a lot more than the JVM does, because Go's GC has much slower allocation due to not being generational.

> man does it take a long time to figure out how Rust wants you to express what you want to express.

This goes away with experience. Certainly, most programmers get up to speed with Go faster than they do with Rust. But at this point the Rust language has totally melted away into the background for me.



> The difference is that Go relies on escape analysis a lot more than the JVM does, because Go's GC has much slower allocation due to not being generational.

True, but this is only part of the story. In Java, every custom type is a reference type. This puts a lot of pressure on the GC because many small objects are created. Go has value types and doesn't suffer from this issue, which is a reason why a generational GC is not as important for Go.


> This puts a lot of pressure on the GC because many small objects are created.

… but they are allocated in the nursery, which isn't comparable to an allocation in Go (it's cheaper by orders of magnitude). It only makes a difference for long-lived objects (which must be tenured), and the difference is noticeable only if the object is small enough to be memcopied multiple times at low cost.

That's why Go is only significantly faster than Java on Go's marketing slides and not in real life.


I already agreed that allocation is faster in Java than in Go. What I'm saying is that allocation speed is less critical in Go than in Java.

In Java, if you allocate an object containing 10 other objects, you have to allocate 11 objects, because you only have reference types (I know there is ongoing work to introduce value types). This is more work for the allocator and for the GC. In Go, you allocate only once.

Honestly, I'm not sure about this issue being very significant for most programs in either Go or Java. As a reminder, most C, C++ and Rust programs don't use a bump allocator either and they're doing well.


> What I'm saying is that allocation speed is less critical in Go than in Java.

Indeed. In fact, Go couldn't afford not to have a generational GC without value types, and Java would be really slow with Go's GC.

> As a reminder, most C, C++ and Rust programs don't use a bump allocator either and they're doing well.

C, C++ and Rust can't use a bump allocator since you need a compacting GC to do so. But allocations are really expensive in these languages and removing them is often the first step of optimization.


> C, C++ and Rust can't use a bump allocator since you need a compacting GC to do so.

It's actually very common to do this in performance-oriented code. You allocate a decently sized temporary chunk of memory at the start of some period of work (say a frame of a game, or something like that), and then most/all temporary allocations are done from that in a bump-pointer fashion. None of it gets freed, but in practice this isn't a big problem as long as you allocate a sufficiently large buffer (or handle overflow by allocating a second).

At the end of the period of work, you free the buffer. In the end you pay for a single malloc/free (and even this you can reduce if you re-use the buffer multiple times), despite having possibly many more allocations.

The downside is that you have some restrictions: you need to be sure none of the objects in the buffer outlives it; in C++ you probably want to ensure all the objects you allocate from it are trivially destructible; etc.

Either way, it's very common to do this sort of thing, at least in C and C++. That's partially because the system allocator is typically slow, but mostly because this sort of control is a big reason to use languages like this.
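For illustration, here's a minimal sketch of that pattern. The comment above is about C and C++, but the same shape works in Go (the `Arena` and `Node` names are just placeholders for this sketch): one up-front allocation, each "alloc" just bumps an index, and releasing everything is O(1).

```go
package main

import "fmt"

type Node struct {
	Value int
	Next  *Node
}

// Arena is a minimal bump allocator: one slab allocated up front,
// with an index that advances on each allocation. Nothing is freed
// individually; Reset releases everything at once.
type Arena struct {
	buf []Node // the slab: a single contiguous allocation
	n   int    // bump pointer (index of the next free slot)
}

func NewArena(capacity int) *Arena {
	return &Arena{buf: make([]Node, capacity)}
}

// Alloc returns a pointer into the slab; this is the "bump" step.
func (a *Arena) Alloc() *Node {
	if a.n == len(a.buf) {
		// Overflow handling elided; a real arena would chain a second slab.
		panic("arena exhausted")
	}
	p := &a.buf[a.n]
	a.n++
	return p
}

// Reset "frees" every object in O(1), ready for the next frame of work.
func (a *Arena) Reset() { a.n = 0 }

func main() {
	a := NewArena(1024)
	var head *Node
	for i := 0; i < 10; i++ {
		n := a.Alloc() // no per-object malloc here
		n.Value = i
		n.Next = head
		head = n
	}
	sum := 0
	for n := head; n != nil; n = n.Next {
		sum += n.Value
	}
	fmt.Println(sum) // prints 45
	a.Reset()
}
```

The whole linked list above costs one allocation (the slab), however many nodes are carved out of it.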


> C, C++ and Rust can't use a bump allocator since you need a compacting GC to do so. But allocations are really expensive in these languages and removing them is often the first step of optimization.

Not true. You can use exactly the same technique in C, C++ and Rust. It's called arena-based allocation.

Even malloc calls can be very cheap. The Hoard allocator uses bump pointer allocation for malloc into empty pages!


> Go couldn't afford not to have a generational GC without value types

And Java couldn't afford not to have value types without a generational GC ;-)

> C, C++ and Rust can't use a bump allocator since you need a compacting GC to do so. But allocations are really expensive in these languages and removing them is often the first step of optimization.

Agreed. This is why Go programmers use the same optimizations as C, C++ and Rust programmers in such a case (by allocating from a pool or an arena).
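For the pool variant, Go's standard library provides `sync.Pool`. A minimal sketch (the `process` helper is hypothetical): buffers are recycled across calls instead of being re-allocated, which reduces GC pressure in hot paths.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable buffers; New is only called
// when the pool is empty.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// process borrows a buffer from the pool, uses it, and returns it.
func process(data string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	defer func() {
		buf.Reset() // keep capacity, drop contents
		bufPool.Put(buf)
	}()
	buf.WriteString(data)
	return buf.String()
}

func main() {
	fmt.Println(process("hello")) // prints hello
}
```

After the first few calls, `process` typically allocates nothing for the buffer itself; the GC may still reclaim pooled objects between collections, which is the price of this being a cache rather than an arena.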

Are you aware of cases where what is gained thanks to the bump allocator is lost because of compaction?


> Agreed. This is why Go programmers use the same optimizations as C, C++ and Rust programmers in such a case (by allocating from a pool or an arena).

Yes, unfortunately in Go allocations are harder to avoid[1] than in C++ or Rust, because Go relies on escape analysis, whereas in the other languages objects must be boxed explicitly.

[1]: https://groups.google.com/forum/#!topic/golang-nuts/Vcgx7hkh...


It's not "unfortunate". It's a known drawback of the tradeoff chosen by Go.


Which can be effectively mitigated by having a generational garbage collector!


> Go has value types and doesn't suffer from this issue, which is a reason why having a generational GC is not that important for Go.

Go is in the same camp as C#/.NET here, where the generational hypothesis certainly holds and therefore .NET uses a generational garbage collector. I am certain that the generational hypothesis holds for Go as well.

Not having a generational GC is a design mistake in Go. Google should fix that.


If you are interested by this topic, here is a relevant discussion on golang-nuts: https://groups.google.com/d/topic/golang-nuts/KJiyv2mV2pU/di...

One of the arguments mentioned against having a generational GC is that it would make a concurrent GC with low latency harder (because Go permits interior pointers, unlike Java) and slower (because Go only needs write barriers, unlike Java, which needs read barriers as far as I know).


Ian Lance Taylor is wrong in that thread. Interior pointers don't make anything harder: this is a solved problem with card marking. Indeed, .NET has interior pointers, and it doesn't cause any problem at all.

Also, I don't believe that the HotSpot GC needs to use read barriers. Read barriers are only necessary if you're concurrently compacting objects (like Azul C4 does). If you have concurrent mark-and-sweep and stop the world only during compaction phases, read barriers are unnecessary.


I didn't know about .NET permitting interior pointers. Thanks.

I think you're right about HotSpot GC not using read barriers. I was mixing it up with the Azul GC, which uses read barriers.

> If you have concurrent mark-and-sweep and stop the world only during compaction phases, read barriers are unnecessary.

Do you know if it's possible to bound the pause caused by the compaction phase to some ceiling, like 2 ms for example? I'm asking because limiting GC pauses is a goal of Go's GC.


Here is an excerpt from the recent proposal on non-cooperative goroutine preemption [1], relevant to our discussion on GC and interior pointers:

> Many other garbage-collected languages use explicit safe-points on back-edges, or they use forward-simulation to reach a safe-point. Partly, it's possible for Go to support safe-points everywhere because Go's GC already must have excellent support for interior pointers; in many languages, interior pointers never appear at a safe-point.

> [...]

> Decoupling stack-move points from GC safe-points. [...] Such optimizations are possible because of Go's non-moving collector.

[1] https://github.com/golang/proposal/blob/master/design/24543-...


G1 has read barriers for SATB marking. Whether a read barrier is used or not is a function of which GC is used, so you can't really say "HotSpot doesn't use read barriers".


Do you have a reference on G1 read barriers? I believe you, but I can't find any information about that from a Google search.

I don't immediately see why you'd need a read barrier for snapshot-at-the-beginning. You only need to trace references from gray objects to white objects that were deleted, which is a write.


Sorry, that was my mistake - it has pre-write barriers for SATB but not actual read barriers.

However, ZGC will have read barriers and if Shenandoah ever gets integrated into Hotspot, it has them too.


> Indeed, .NET has interior pointers, and it doesn't cause any problem at all.

Are you sure about this (without using C# unsafe context)? I've spent a moment digging in C# documentation and been unable to confirm this.


Yes. The "ref" keyword allows taking interior pointers to data members as function arguments [1]. There are also local references [2].

[1]: https://docs.microsoft.com/en-us/dotnet/csharp/language-refe...

[2]: https://docs.microsoft.com/en-us/dotnet/csharp/language-refe...


.NET doesn't have interior pointers. Any `ref` must be on the stack, and that's tracked in the stack map. You cannot have a ref as a field.


.NET doesn't have heap interior pointers (today), but that doesn't matter for this argument. You still need to be able to mark objects as live even if they're only referenced by interior pointers.


> .NET doesn't have heap interior pointers (today), but that doesn't matter for this argument.

I think it matters for this argument.

You are arguing that designing a garbage collector which is concurrent, low latency (pauses < 5 ms), compacting, generational, and supports interior pointers, is easy.

You mentioned C#/.NET as an example, but C# doesn't have interior pointers (heap-to-heap, that is; stack-to-heap is possible).

As far as I know, none of Java, .NET, Go, Haskell, OCaml, D, V8 or Erlang satisfies all these requirements at the same time.

I'm not saying it's impossible, and Ian Lance Taylor, in the thread I linked earlier, isn't saying that either. I'm just saying it's certainly hard.

If it were so easy, then most languages would already have interior pointers and a concurrent, low-latency, compacting, generational GC. That's the whole difference between "today" (as in your comment) and "tomorrow".


> In Java, every custom type is a reference type.

I'm not sure this is true in the actual native code produced by the JIT. HotSpot does autobox elimination.


No, it’s sadly true. EA may scalarize the allocation but this optimization falls apart very easily in Hotspot.


> How exactly is the Go GC easier to "triage" than that of the JVM?

Well, firstly the language has value semantics, so in many cases you can read the source code knowing nothing about escape analysis and make inferences about where data lives. With Java, you're at the mercy of escape analysis. Further, Go's escape analysis seems simpler; I've been programming in Java on and off for a decade and I depend on profiling to hunt down allocations (except in very simple cases). This isn't the case in Go, despite my having worked with it for only a few years, recreationally.

> This goes away eventually

I keep hearing that, but I'm 4 years into Rust and I'm still not able to be reasonably productive. I'm sure it's possible to get there, but I doubt I will ever be as productive in Rust as I am in Go for the class of applications I write.


> Well, firstly the language has value semantics, so in many cases you can read the source code knowing nothing about escape analysis and make inferences about where data lives.

Are you referring to heap vs. stack? That is not as simple as it seems, because of escape analysis. The only thing that Go lets you do that Java doesn't is put structs inside other structs, which is nice, but it's not a game-changer.

> Further, Go's escape analysis seems simpler; I've been programming in Java on and off for a decade and I depend on profiling to hunt down allocations (except in very simple cases). This isn't the case in Go despite working with it for only a few years recreationally.

I don't see how the HotSpot JVM escape analysis can be "simpler" than that of Go. They work the same way.


> Are you referring to heap vs. stack? That is not as simple as it seems, because of escape analysis. The only thing that Go lets you do that Java doesn't is put structs inside other structs, which is nice, but it's not a game-changer.

This isn't true. I can get a contiguous slice of structs, and I can pass value types by copy without needing to reason about escape analysis at all. If I do `var foo Foo` (assuming Foo is a struct type), I know that `foo` won't escape unless I return a pointer to it (or to something in it), at which point it's subject to escape analysis. I can get quite a long way in Go without needing to reason about escape analysis at all.

> I don't see how the HotSpot JVM escape analysis can be "simpler" than that of Go. They work the same way.

I'm pretty sure they don't work the same way, but I don't have any examples off hand to share (it's been a while since I worked with Java). I think Java's escape analysis is more sophisticated.



