When is a slice of any not a slice of any?
When is []any not []any in Go? When generics get involved!
Will the below compile? If not, why?
package main
func main() {
one([]any{})
}
func one[S []E, E any](s S) {
two(s)
}
func two(s []any) []any { return s }
It doesn't compile, with this error:
generics/main.go:9:13: cannot use s (variable of type S constrained by []E) as type []any in argument to two:
cannot assign []E (in S) to []any
It'd be quite reasonable to think it should compile, because:
- one has the type parameter [S []E, E any]
- by substitution, S = []E, E = any, so S = []any, surely?
- therefore we should be able to pass S into a function accepting []any?
But we know from the error that this reasoning is missing something. But what?
Note: any is 'an alias for interface{} and is equivalent in all ways', which landed alongside generics.
Why did Go add generics?
It'll be useful to remind ourselves why Go added generics. Without them we couldn't write certain kinds of functions in a type-safe way, for example "reverse a slice". Pre-generics, we had to sacrifice type-safety in some way: either write a function accepting any (AKA interface{}) and use reflection, or try writing a function accepting []any. I'll discuss just the second approach, reverseOld:
// technically it would have been []interface{} pre-Go 1.18
func reverseOld(s []any) []any { /* ... */ }
Here's the generic version:
func reverseGeneric[S []E, E any](s S) S { /* ... */ }
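The post only gives the signature, so here's a minimal sketch of what the body of reverseGeneric might look like. The implementation (returning a reversed copy rather than reversing in place) is my assumption, not something the signature dictates:

```go
package main

import "fmt"

// reverseGeneric returns a reversed copy of s, keeping the caller's slice type S.
// The body is an assumed implementation; only the signature comes from the post.
func reverseGeneric[S []E, E any](s S) S {
	out := make(S, len(s))
	for i, v := range s {
		out[len(s)-1-i] = v
	}
	return out
}

func main() {
	fmt.Println(reverseGeneric([]int{1, 2, 3}))     // [3 2 1]
	fmt.Println(reverseGeneric([]string{"a", "b"})) // [b a]
}
```

The same function body serves both []int and []string, and each call gets back the slice type it passed in.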
The reverseGeneric version is better for two reasons. First, the reverseOld approach loses type information: even when our data is really a slice of io.Reader values, all we can get back is []any:
readers := []io.Reader{}
// won't compile - reverseOld deals only in []any, so the []io.Reader type is lost
var rev []io.Reader = reverseOld(readers)
// with generics we keep the type
var revGen []io.Reader = reverseGeneric(readers)
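Secondly, the reverseOld function can't work with all slices, but reverseGeneric can. reverseOld can only accept a value of type []any itself. A slice with any other element type, whether concrete or another interface type, has to be copied element by element into a []any first:
files := []*os.File{&os.File{}}
readers := []io.ReadCloser{&os.File{}}
boxes := []any{&os.File{}}
// works: boxes is exactly []any
_ = reverseOld(boxes)
// fails: cannot use files (variable of type []*os.File) as type []any in argument to reverseOld
_ = reverseOld(files)
// also fails: []io.ReadCloser is a different slice type from []any
_ = reverseOld(readers)
// works
_ = reverseGeneric(files)
_ = reverseGeneric(readers)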
This second restriction might not surprise you at all - if so, skip ahead. But if you're thinking something like 'Huh?! But doesn't []any mean "any slice"?', you are not alone: read on.
[]any has never meant any slice
Even before generics, []interface{} (exactly equivalent to []any) did not mean "any slice".
Values with an interface type in Go are 'boxes' for concrete types, not the concrete types themselves. Go does a lot to hide this from you for convenience, so you might not have realised, especially as some other languages don't need boxes for interface-typed values - e.g. TypeScript. In Go, interface-typed values have a static type of a specific interface type. But to enable them to hold different concrete values at runtime they store their contents as a pair of pointers, one to the type of the concrete value inside, and one to the value itself:
You should be starting to see that although every concrete value can be placed inside an any/interface{} 'box', this doesn't imply that concrete values are always boxed. And therefore []any, before and after generics, means "a slice of interface boxes". The diagram below shows that the backing array for a slice of three int32 values is a completely different size and layout from the array backing a slice of interface values containing the same int32 values:
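The elements of []int32 or []string are unboxed, so those slices can't be assigned to []any. The elements of []io.Reader or []interface{ foo() } are boxed, but those are still different slice types from []any, and Go never converts one slice type to another implicitly, so they can't be assigned to []any either without copying element by element.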
This took me ages to get my head around. If this quick explanation hasn't clicked with you, I've written another post that explains why interfaces work like this in Go (even pre generics).
How are generics implemented?
We've learned type-parameters let us write a much broader set of generic functions than interfaces alone did. Now let's have a look into how Go compiles generic code. After that, we'll have enough to solve our puzzle.
Intuitively, Go's generics can be thought of as working by doing some copy-pasting - called 'instantiation' in the spec - for you. If you have a function that you want to work on a set of types, it'll 'copy-paste' specialised functions where the type parameters are replaced with specific types. For example, let's look at a function with the signature of our one function from the puzzle:
package main
func one[S []E, E any](s S) {}
func main() {
one([]int{})
one([]string{})
one([]any{})
}
The compiler will generate implementations of the one function above for the types assignable to S (the spec calls this the "type set"). You'd think for any that'd be every type, a huge amount to compile! But in reality Go does this only for the subset of types that we actually pass into one in our program. The compiled functions will look a bit like the below (real output, and detailed docs), where the type parameter gets replaced with the specific types we used:
// compiler generates the following implementations for us
func one_ints(s []int) { }
func one_strings(s []string) { }
func one_interfaces(s []interface{}) { }
This should make it clear why S can't be considered []any. Two out of three of our compiled function implementations accept slices of a specific concrete type, which aren't assignable to []any.
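You can see the effect of instantiation from inside a generic function. In this sketch, typeName is a hypothetical stand-in for one that reports the type of its argument using fmt's %T verb, which in each instantiated copy is the type argument chosen for S:

```go
package main

import "fmt"

// typeName is a hypothetical stand-in for one: it reports the type of s,
// which is whatever S was instantiated with at the call site.
func typeName[S []E, E any](s S) string {
	return fmt.Sprintf("%T", s)
}

func main() {
	fmt.Println(typeName([]int{}))    // []int
	fmt.Println(typeName([]string{})) // []string
	fmt.Println(typeName([]any{}))    // []interface {}
}
```

Only the third call is working with a slice of interface boxes; the other two see plain []int and []string.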
So to correct our reasoning from the start of the post:
- one has the type parameter [S []E, E any]
- as we just learned, E can be instantiated with a concrete type (we saw that's a major motivation for generics), which won't have an interface 'box'
- since S is a slice of E, which can be concrete, the type set of S is a superset of []any. []any itself is a single type, a slice of interface boxes, and we know a slice of interface values is a different shape to a slice of concrete types
- therefore we can't pass S into two, a non-generic function accepting []any.
This makes sense! As said, a major motivation for generics was exactly that, writing functions that can work with all slices, including slices of concrete types.
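If you did want the puzzle to compile, there are at least two ways to go about it; this is a sketch of both rather than a recommendation. Either box the elements into a fresh []any before calling two, or make the callee generic as well. The name twoGeneric and the changed return type of one are my additions for illustration:

```go
package main

import "fmt"

func two(s []any) []any { return s }

// twoGeneric is a hypothetical generic variant of two that keeps the slice type.
func twoGeneric[S []E, E any](s S) S { return s }

// one boxes each element into a fresh []any before calling two.
func one[S []E, E any](s S) []any {
	boxed := make([]any, len(s))
	for i, v := range s {
		boxed[i] = v
	}
	return two(boxed)
}

func main() {
	fmt.Println(one([]int{1, 2}))        // [1 2]
	fmt.Println(twoGeneric([]int{1, 2})) // [1 2]
}
```

The first option pays for an allocation and a copy on every call; the second keeps the concrete slice type all the way through.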
Same interface, different behaviour
Solving this mystery has taught us something important, and perhaps unfortunate: any, interface{} and io.Reader behave differently when they appear in type parameters vs elsewhere. Let's look at another way (t any) and [T any](t T) differ: type assertions. In Go, we can use type assertions to identify the specific concrete type of the value in an interface 'box' at runtime. Or, at least, we can if we're not using the interface type in a type parameter (with what you've learned, perhaps you can figure out why?). For instance:
package main
import ("fmt"; "io"; "os")
func normal(f io.Reader) {
if iv, ok := f.(*os.File); ok {
fmt.Println("unwrapped you!", iv)
}
}
func generic[F io.Reader](f F) {
if iv, ok := f.(*os.File); ok {
fmt.Println("unwrapped you!", iv)
}
}
If you try to compile this you'll see normal is fine (as *os.File can be assigned to io.Reader), but generic has a compile error:
invalid operation: cannot use type assertion on type parameter
value f (variable of type F constrained by io.Reader)
This is captured in the spec, which states that type assertions can be used only on an expression "of interface type, but not a type parameter". But with what we've learned we can explain why this rule is in the spec. In non-generic code, the value is always boxed in an interface value. In generic code we may use an interface type to constrain the type parameter, but the value can be concrete at runtime in the instantiated functions. So there's simply no interface 'box' to look inside. You could imagine how Go could implement it, and indeed people have asked, but currently it doesn't.
Far from basic
We're almost done, but we should consider another new feature of generics, the ability to constrain interfaces with types, rather than just methods:
package main
import "fmt"
type stringableNumeric interface {
// Listing concrete types means only the types in that set
// which also implement all the methods are in stringableNumeric's type set.
// ~T means types with the underlying type of T, e.g. ~int includes printableInt below
~int | ~float32 | ~float64
fmt.Stringer
}
type printableInt int
func (i printableInt) String() string {
return fmt.Sprintf("%d", i)
}
func main() {
var answer stringableNumeric = printableInt(42)
}
func withStringableNumeric(n stringableNumeric) {}
Again, we run into surprising behaviour when we use a type defined with interface: we get compile errors for both var answer stringableNumeric and (n stringableNumeric):
error: cannot use type stringableNumeric outside a type constraint: interface contains type constraints
These compile errors demonstrate that although our stringableNumeric interface uses the same interface keyword as other interfaces, it clearly works differently. Indeed, the spec tells us interfaces are now split into 'general' - those with concrete type constraints - and 'basic' - those with only method constraints:
Interfaces that are not basic may only be used as type constraints, or as elements of other interfaces used as constraints. They cannot be the types of values or variables, or components of other, non-interface types.
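For contrast, here's a sketch of the one place stringableNumeric is allowed: as a type constraint. The function describe is hypothetical; the interface and printableInt are the ones from the example above:

```go
package main

import "fmt"

type stringableNumeric interface {
	~int | ~float32 | ~float64
	fmt.Stringer
}

type printableInt int

func (i printableInt) String() string { return fmt.Sprintf("%d", i) }

// describe is a hypothetical function; stringableNumeric appears only as a constraint.
func describe[N stringableNumeric](n N) string {
	return "value: " + n.String()
}

func main() {
	fmt.Println(describe(printableInt(42))) // value: 42
}
```

Here n has the concrete type printableInt inside the instantiated function, so no value of type stringableNumeric ever needs to exist.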
The many meanings of interface, after generics
After generics, Go now has 'basic' and 'general' interface types, which share the same keyword but can't be used in the same places. Further, as we've seen, 'basic' interface types behave completely differently when they're used in a type parameter compared to anywhere else. Together, I think that creates at least three distinct behaviours attached to the interface keyword:
| Category of interface | Non-generic code | Type parameter |
|---|---|---|
| Basic | Always boxed value, type-assertions ok | Both boxed/unboxed values, no type-assertions |
| General | Not allowed | As above |
Sharing the interface keyword between generic and non-generic code, and between basic and general interfaces, is a tradeoff. It of course avoids a new keyword, and sometimes it's intuitive: you can often use generics without understanding the mechanics and it'll work how you'd expect. But the cost is confusion - in both senses of the word. This was pointed out by some maintainers during the design discussions:
I remain concerned that this proposal overloads words (and keywords!) that formerly had very clear meanings — specifically the words type and interface and their corresponding keywords — such that they each now refer to two mostly-distinct concepts that really ought to instead have their own names
I'm not stating this tradeoff was obviously bad. But it certainly has curious implications!
Thanks to Andy Appleton, Hayden Faulds, Juan Alberto Sanchez and Vik Tomas for reading and providing feedback.