A Conversation with Anders Hejlsberg

Let's look at Part VII, Generics in C#, Java, and C++. Having since struggled through both Java generics and C++ templates myself, I read it differently than I did before. First, generics can be defined as "parametric polymorphism" or "parameterized types": the ability to give a type its own type parameters.

Comparing C# and Java Generics
It is, of course, great that you don’t have to modify your VM, but it also brings about a whole bunch of odd limitations. The limitations are not necessarily directly apparent, but you very quickly go, “Hmm, that’s strange.”

Java tried to preserve backward compatibility, and odd behavior is the result.

For example, with Java generics, you don’t actually get any of the execution efficiency that I talked about, because when you compile a generic class in Java, the compiler takes away the type parameter and substitutes Object everywhere. So the compiled image for List<T> is like a List where you use the type Object everywhere. Of course, if you now try to make a List<int>, you get boxing of all the ints. So there’s a bunch of overhead there. Furthermore, to keep the VM happy, the compiler actually has to insert all of the type casts you didn’t write. If it’s a List of Object and you’re trying to treat those Objects as Customers, at some point the Objects must be cast to Customers to keep the verifier happy. And really all they’re doing in their implementation is automatically inserting those type casts for you. So you get the syntactic sugar, or some of it at least, but you don’t get any of the execution efficiency. So that’s issue number one I have with Java’s solution.

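The inserted casts carry a runtime cost, so no efficiency is gained, which is the exact opposite of what Sun claims. Every parameterized type is simply replaced by its raw type (for example, List<String> becomes List, and T becomes its erasure: the declared bound, or Object if there is none). At that point generics are nothing more than syntactic sugar.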
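To make the erasure point concrete, here is a minimal Java sketch. The class name ErasureDemo and its two methods are my own illustration, not something from the interview. It checks that List<String> and List<Integer> share one runtime class, and it uses a raw reference to slip an Integer into a List<String> so that the cast the compiler inserted on your behalf becomes visible when it fails.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class ErasureDemo {

    // After erasure both lists are plain ArrayList objects,
    // so their runtime classes are identical.
    static boolean sameRuntimeClass() {
        List<String> names = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        return names.getClass() == numbers.getClass();
    }

    // The raw reference bypasses the compile-time check. The failure only
    // appears later, at the (String) cast the compiler inserted around get().
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean hiddenCastFails() {
        List<String> names = new ArrayList<>();
        List raw = names;              // raw view of the same list object
        raw.add(Integer.valueOf(42));  // accepted: at runtime the list holds Objects
        try {
            String first = names.get(0); // implicit cast to String happens here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("same runtime class: " + sameRuntimeClass());
        System.out.println("hidden cast failed: " + hiddenCastFails());
    }
}
```

Note that the Integer stored in the list is itself a boxed object. Java has no List<int>, which is the boxing overhead Hejlsberg mentions.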

Issue number two, and I think this is probably an even bigger issue, is that because Java’s generics implementation relies on erasure of the type parameter, when you get to runtime, you don’t actually have a faithful representation of what you had at compile time. When you apply reflection to a generic List in Java, you can’t tell what the List is a List of. It’s just a List. Because you’ve lost the type information, any type of dynamic code-generation scenario, or reflection-based scenario, simply doesn’t work. If there’s one trend that’s pretty clear to me, it’s that there’s more and more of that. And it just doesn’t work, because you’ve lost the type information. Whereas in our implementation, all of that information is available. You can use reflection to get the System.Type for object List<T>. You cannot actually create an instance of it yet, because you don’t know what T is. But then you can use reflection to get the System.Type for int. You can then ask reflection to please put these two together and create a List<int>, and you get another System.Type for List<int>. So representationally, anything you can do at compile time you can also do at runtime.

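The biggest problem is that, given a parameterized type such as List<T>, there is no way to find out at runtime what T actually is, because the compiler erases every type parameter (T in this case). The information available at compile time therefore differs from what is available at runtime, and reflection and all sorts of other operations will not work. Java's suggested workaround is to pass in a Class<T>, the class object that describes an arbitrary Java class.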
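The Class<T> workaround can be sketched roughly as follows. TypedBox is a hypothetical class I made up for illustration. The idea is only that the caller hands over the class object explicitly, so the container still knows its element type after erasure.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical container that carries its element type as a "type token".
class TypedBox<T> {
    private final Class<T> elementType;              // survives erasure because it is an ordinary field
    private final List<T> items = new ArrayList<>();

    TypedBox(Class<T> elementType) {
        this.elementType = elementType;
    }

    // What reflection alone cannot tell us, the token can.
    Class<T> elementType() {
        return elementType;
    }

    // Accepts an arbitrary object and stores it only if it really is a T.
    boolean addIfCompatible(Object candidate) {
        if (!elementType.isInstance(candidate)) {
            return false;
        }
        items.add(elementType.cast(candidate));      // checked cast through the token
        return true;
    }

    int size() {
        return items.size();
    }

    public static void main(String[] args) {
        TypedBox<String> box = new TypedBox<>(String.class);
        System.out.println(box.addIfCompatible("hello"));
        System.out.println(box.addIfCompatible(42));
        System.out.println(box.elementType().getName());
    }
}
```

The cost of this design is that every caller has to pass String.class (or similar) by hand, which is exactly the information C# keeps for you.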

Comparing C# Generics to C++ Templates
To me the best way to understand the distinction between C# generics and C++ templates is this: C# generics are really just like classes, except they have a type parameter. C++ templates are really just like macros, except they look like classes.

In effect, a C++ template is a macro.

The big difference between C# generics and C++ templates shows up in when the type checking occurs and how the instantiation occurs. First of all, C# does the instantiation at runtime. C++ does it at compile time, or perhaps at link time. But regardless, the instantiation happens in C++ before the program runs. That’s difference number one. Difference number two is C# does strong type checking when you compile the generic type. For an unconstrained type parameter, like List<T>, the only methods available on values of type T are those that are found on type Object, because those are the only methods we can generally guarantee will exist. So in C# generics, we guarantee that any operation you do on a type parameter will succeed.

C++ is the opposite. In C++, you can do anything you damn well please on a variable of a type parameter type. But then once you instantiate it, it may not work, and you’ll get some cryptic error messages. For example, if you have a type parameter T, and variables x and y of type T, and you say x + y, well you had better have an operator+ defined for + of two Ts, or you’ll get some cryptic error message. So in a sense, C++ templates are actually untyped, or loosely typed. Whereas C# generics are strongly typed.

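The biggest difference is where type checking and instantiation happen. In C#, instantiation happens at runtime: given List<T>, List<int> does not exist until it is actually used. Only the IL for List<T> exists, and when List<int> is first encountered, the runtime is handed the type int and creates List<int>. C++ is the exact opposite: it instantiates at compile time, before the program ever starts.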

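Another difference is that C# can run reflection on the type T it is given, though of course C++ has almost no such concept. Also, in C# an unconstrained T is treated as Object, so only operations that are guaranteed to work are allowed, whereas in C++ you can do anything, provided you get past the assorted syntax hurdles. For example, a C++ template can take x and y of type T and write x + y even when no operator+ has been overloaded for T. The result? An utterly unreadable error message at instantiation time, a chronic C++ problem.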
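The "only guaranteed operations" rule can be sketched in Java, which checks type parameters at compile time in the same spirit as C#. This is a stand-in rather than C# itself: C# would write the constraint as where T : IComparable<T>. The class BoundsDemo and its methods are hypothetical names of my own.

```java
// Hypothetical demo of compile-time checking on type parameters.
class BoundsDemo {

    // Unbounded T: only methods declared on Object (toString, equals, ...) may be called.
    // Writing value.compareTo(...) here would be rejected when this method is compiled,
    // not later when someone instantiates it.
    static <T> String describe(T value) {
        return value.toString();
    }

    // Bounded T: the bound guarantees compareTo exists, so the call is accepted.
    static <T extends Comparable<T>> T max(T a, T b) {
        return a.compareTo(b) >= 0 ? a : b;
    }

    public static void main(String[] args) {
        System.out.println(describe(42));
        System.out.println(max(3, 7));
        System.out.println(max("pear", "apple"));
    }
}
```

In a C++ template the equivalent of max would compile without any declared bound, and the error would only show up for a T that lacks the needed operator.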

—————————————

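Seen this way, C# really does look like a good language. C++ is of course a mature language, but careless template use leads to code bloat, and the only people who can prevent that are "experienced C++ programmers". How much relentless effort does it take to become one? Once C# gets more attention from industry and library developers and builds up more resources, I'll think about moving over. (In my own case, for example, something like an R-Tree written in C# is hard to come by.)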