This interview is based on C# In Depth, Second Edition. It is being reproduced here
by permission from Manning Publications. Manning Early Access Program (MEAP) books and ebooks are sold exclusively through
Manning. Visit the book's page for more information.
BROMBERG: Hi, this is Peter Bromberg. I'm on with Jon Skeet. I met Jon some years ago on the
Microsoft campus. There were a lot of discussions about C# and its evolution,
and I can't remember, Jon, if that was C# 3.0 or what - I think it was C# 3.0.
SKEET: I think that we were talking about C# 3.0, but C# 2.0 was about to be released.
BROMBERG: I see. They're always one step ahead of us, aren't they? But anyway, Jon now works
for Google, and of course we should tell
you that whatever Jon says is his own view and not that of his employer or any of
its affiliates. So, Jon, you've got a new edition of your C# In Depth book out,
it's edition two, which I've read partially. I especially like the way that it's
laid out. I wonder if you'd tell us a little bit about your motivation for writing
this book.
SKEET: So I'd previously worked with Manning, the publisher of Groovy In Action, and was
looking for something to do. I was always more keen on C# than Java, which is
the other language I primarily work with. I had a few ideas knocking around for
a Java book, possibly looking at some of the core bits of Java - string handling,
number handling, date handling - things that are really important that people often
get wrong. This was about the time that C# 3.0 had been announced and there were
some draft specifications going around. I was just chatting with an editor; she
talked to me about writing on C# and I absolutely jumped on it. For me it's fun
to write and I always find I learn a lot about whatever I'm writing about. I
think Joe Duffy said something about you never know how much you don't know about
a topic until you start having to try to write about it. If you try to explain
something in detail you really find out the bits that you just sort of skimped
on before. So it was a chance for me to learn much more about C# and really study
it properly - and I just enjoy writing about topics that I'm enthusiastic about.
BROMBERG: And C# 3.0 was a really major move forward in my opinion - it really showed there
are people with some real vision working on that language.
SKEET: Absolutely - but at the same time there are loads of developers struggling to get
to grips with what everything means and bits that at first look like black magic.
One of my motivators was really trying to rip away all the magic so that people
could really understand what was going on under the hood. As C# has evolved,
so many more of the features really involve the compiler being quite smart. Think
anonymous types, LINQ query translations and, particularly in C# 2, iterator blocks.
The compiler's doing huge amounts of work behind the scenes for you, and you
can sort of see what's going on just by reading a few blog posts, doing a tutorial
etc. But I wanted to get one level underneath that to say, okay, in the case of
iterator blocks it's having to build a state machine for you behind the scenes.
What are the consequences of that? Well, it means it's got an extra type and all
your previously local variables end up being members of that type and - oh, it
implements IDisposable; what happens if we don't dispose of things, and how are finally
blocks really implemented under the hood? All kinds of things sort of "come
out" as soon as you start looking at the real details, and that's the level
at which I certainly like to understand the language... and I'm hoping that my readers
do too.
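As an editorial illustration of the iterator-block points above, here is a minimal sketch (not taken from the book). The names `IteratorBlockDemo`, `Numbers` and the `log` parameter are assumptions made up for this example. It shows that calling an iterator method runs none of its body, and that the `finally` block only runs when the generated iterator is disposed or runs to completion.

```csharp
using System;
using System.Collections.Generic;

public static class IteratorBlockDemo
{
    // An iterator block: the compiler rewrites this method into a hidden
    // state-machine type. The log parameter exists only for illustration.
    public static IEnumerable<int> Numbers(List<string> log)
    {
        log.Add("start");
        try
        {
            yield return 1;
            yield return 2;
        }
        finally
        {
            log.Add("finally");
        }
    }

    public static void Main()
    {
        var log = new List<string>();

        // Calling the method executes none of its body yet.
        IEnumerable<int> sequence = Numbers(log);
        Console.WriteLine(log.Count); // 0

        // foreach disposes the iterator when we break out early,
        // and that Dispose call is what runs the finally block.
        foreach (int n in sequence)
        {
            break;
        }
        Console.WriteLine(string.Join(",", log)); // start,finally

        // Driving the iterator by hand and never disposing it:
        // the finally block does not run.
        log.Clear();
        IEnumerator<int> manual = Numbers(log).GetEnumerator();
        manual.MoveNext();
        Console.WriteLine(string.Join(",", log)); // start
    }
}
```

The last case is the "what happens if we don't dispose" question: the iterator stays suspended inside the `try` block, so any cleanup placed in `finally` is simply skipped.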
Generics is another example: people see how to use generics for collections but they
may never even consider writing their own generic types or their own generic
methods; seeing how the different features hang together and what's available
just puts you in a much better position to take advantage of the language to
the fullest.
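To make the point about writing your own generic types and methods concrete, here is a small sketch. `Pair<TFirst, TSecond>`, `MakePair` and `Largest` are hypothetical names invented for this illustration, not code from the book.

```csharp
using System;
using System.Collections.Generic;

// A small generic type of our own (hypothetical, for illustration only).
public sealed class Pair<TFirst, TSecond>
{
    public TFirst First { get; }
    public TSecond Second { get; }

    public Pair(TFirst first, TSecond second)
    {
        First = first;
        Second = second;
    }
}

public static class GenericsDemo
{
    // A generic method: the compiler infers the type arguments at the call site.
    public static Pair<TFirst, TSecond> MakePair<TFirst, TSecond>(TFirst first, TSecond second)
    {
        return new Pair<TFirst, TSecond>(first, second);
    }

    // A generic method with a constraint, so CompareTo is known to exist on T.
    public static T Largest<T>(IEnumerable<T> items) where T : IComparable<T>
    {
        using (IEnumerator<T> e = items.GetEnumerator())
        {
            if (!e.MoveNext())
            {
                throw new InvalidOperationException("Sequence is empty");
            }
            T best = e.Current;
            while (e.MoveNext())
            {
                if (e.Current.CompareTo(best) > 0)
                {
                    best = e.Current;
                }
            }
            return best;
        }
    }

    public static void Main()
    {
        Pair<string, int> pair = MakePair("answer", 42); // type arguments inferred
        Console.WriteLine(pair.First + "=" + pair.Second); // answer=42
        Console.WriteLine(Largest(new[] { 3, 9, 4 }));    // 9
    }
}
```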
BROMBERG: Yeah, and you certainly have done that in this latest book. I like the way it's
laid out, it's very progressive, and each chapter kind of lays the groundwork
for the next one. Now, you used to be very active in the C# newsgroup, and I
think you've probably moved yourself over to Stack Overflow because I looked
at the rankings and you were up at the top.
SKEET: Yeah, so that happened just over two years ago. I remember, I think it was September
26th or something like that. I looked at Stack Overflow one afternoon; I'd
heard about it before but not really tried it. And I have to admit it was a vanity
search - I'd done a search on "C# In Depth" and found that someone
had referred to it on Stack Overflow so I thought, "OK, I'll have a little
look at this, see whether there are any points that I need to comment on"
- whatever, and if people are talking about the book I'd like to make sure that
if they're saying nice things, that's great; but if they're saying bad things
I want to find out about that to improve the book. So I had a little look and
I thought, "I can answer one of these; OK let's answer this." And it
was very much like a newsgroup sort of answer. And there was another, and another
- and I very quickly realized that this was something I would be spending a lot
of time on. I had tried some web forums before, but somehow they didn't have
nearly the polish that Stack Overflow did.
BROMBERG: Well that's interesting. I think they wrote that in MVC, didn't they?
SKEET: They did, yes - it was in beta at the time; the first release version didn't come out until
after Stack Overflow had gone live. It's little things like the markdown editor:
for one thing, the markdown itself is nice and terse but expresses what you need
to express, certainly for coding questions. Having a little preview there that's
just live as you update it, you know what's going to happen. So there are various
things where there are advantages already to Stack Overflow over the newsgroups,
which are just plain text. People paste HTML but that's generally regarded as
a bad thing.
SKEET: I love newsgroups for various other reasons. I like their threading model, for example,
whereas Stack Overflow is largely flat, but I could see very quickly the richness
of Stack Overflow... its sheer energy. You could tell that there were loads
of people there; that they were interested - they were all over the questions.
If you post an interesting, well-written question on Stack Overflow you will
get people swarming all over it - happy to give you their experiences. It's great.
BROMBERG: Yes, I've referred to it and you know, it's only as good as the people who populate
it. So let's go back to your book. I wonder if you could just kind of go over
what you think are some of the most confusing bits of C#.
SKEET: So I think one of the really big things that confuses a lot of people is how variables
are handled within, first, anonymous methods from C# 2.0, and then lambda expressions
in C# 3.0. You've got to get this notion that it's really capturing the variable,
not its initial value... which is particularly confusing if you come from Java,
where they have anonymous inner classes but you can only refer to final variables.
It just copies the value when it creates the instance of the anonymous inner
class... but in C#, no, you really are capturing the variable, and that looks really
weird to start with because it's a local variable but... what happens if we're
returning a delegate, how can it possibly...? Surely that variable has been
on the stack, as we tended to think before. But no, you look under the hood again
and it's created an extra type and captured the variable in that place and created
all the necessary instances. And it's all really clever - so long as you know
what to expect, particularly when it comes to loops. So I don't know whether
you've actually written a for loop or a foreach loop where you use that variable
- the iteration variable - within a lambda expression, to create some kind of
delegate that you're keeping around, or maybe it's starting a new thread.
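A minimal sketch of "capturing the variable, not its value" may help here. The names `CaptureDemo`, `MakeCounter` and `ReadAfterChange` are made up for this illustration. It shows a local variable outliving the method that declared it, and a lambda observing a change made after the lambda was created.

```csharp
using System;

public static class CaptureDemo
{
    // Returns a delegate that keeps using the local variable "count"
    // after MakeCounter has returned. The compiler hoists count into
    // a generated class rather than leaving it on the stack.
    public static Func<int> MakeCounter()
    {
        int count = 0;
        return () => ++count;
    }

    // The lambda captures the variable x itself, not the value 1.
    public static int ReadAfterChange()
    {
        int x = 1;
        Func<int> read = () => x;
        x = 10;
        return read();
    }

    public static void Main()
    {
        Func<int> counter = MakeCounter();
        Console.WriteLine(counter());         // 1
        Console.WriteLine(counter());         // 2
        Console.WriteLine(ReadAfterChange()); // 10
    }
}
```

Each call to `MakeCounter` creates a fresh captured variable, so two counters obtained from two calls count independently.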
SKEET: So the example I often give is: for each URL in this collection of URLs, start a
new thread to fetch the URL - and you express the idea of "fetch that URL"
with a lambda expression. That sounds really straightforward, right? A very
good example of lambda expressions making it easy to express a task, except you
end up with, just occasionally, one URL being fetched twice or one not being fetched
at all. And you think, "Well, what's happened here?" And actually each
thread ends up using the URL variable, which may have gone on to
the next iteration before the thread started, so it skips one, or maybe two of them
are using the same one. And there's this weird way around it of creating a copy
of the iteration variable within the loop, because that way you get
different instances of the local variable for each iteration, which sounds weird to start with,
right? But because those are captured separately, you end up sorting it out.
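The loop problem and its workaround can be sketched as follows. This is an illustration under stated assumptions: it stores delegates in a list instead of starting threads so the result is deterministic, and it uses a `for` loop because, from C# 5 onwards, `foreach` gives each iteration a fresh variable while a `for` loop variable is still shared. `LoopCaptureDemo` and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class LoopCaptureDemo
{
    // Every lambda captures the single loop variable i, so they all
    // observe whatever value it has when they are eventually invoked.
    public static List<Func<int>> CaptureShared(int count)
    {
        var delegates = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            delegates.Add(() => i);
        }
        return delegates;
    }

    // The workaround: copy the loop variable into a local declared
    // inside the loop body. Each iteration gets its own "copy" variable,
    // and each lambda captures a different one.
    public static List<Func<int>> CaptureCopy(int count)
    {
        var delegates = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            int copy = i;
            delegates.Add(() => copy);
        }
        return delegates;
    }

    public static void Main()
    {
        foreach (Func<int> f in CaptureShared(3))
        {
            Console.Write(f() + " "); // 3 3 3
        }
        Console.WriteLine();
        foreach (Func<int> f in CaptureCopy(3))
        {
            Console.Write(f() + " "); // 0 1 2
        }
        Console.WriteLine();
    }
}
```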
But it's really unintuitive the first time you see it. Once you've done it once
you sort of know what to expect - and that's generally the case. Another example
of exactly that is iterator blocks: if you want to write a method that's going
to return - well - if you were writing the equivalent of "select" from
LINQ it's really simple: you get your source, you get your projection; you iterate
over every item in the source, you apply the projection and you yield return it
- dead easy - until you start putting in parameter validation. So you say,
"well, if source equals null then throw ArgumentNullException; if projection
equals null throw ArgumentNullException" - and then you try it - and if you try
it with a foreach loop - foreach result in select null - ahah! - it will go bang
- and it will indeed. But if you just call select, then none of your code executes,
because of the whole lazy way that yield return and iterator blocks work, and
you have to go through the hassle; in that case you split it out into
two methods, one for the parameter validation and one for the
iteration - and it's a bit of a pain, but again once you know it, it's not too much
of a problem. So there aren't that many little traps in C# compared with some
other languages; it's really pretty clean. And for most of them, once you know
what's happening, it's easy enough to predict, easy enough to work around. But in these
cases, if no-one points them out to you, you could waste hours on that kind of
thing.
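The "split it into two methods" pattern described above might look like the sketch below. The method names (`LazyValidatedSelect`, `EagerValidatedSelect`, `SelectIterator`) are assumptions chosen for this illustration; this is not the real LINQ `Select` implementation.

```csharp
using System;
using System.Collections.Generic;

public static class SelectDemo
{
    // Validation inside the iterator block: nothing here runs until the
    // caller starts iterating, so a null argument goes unnoticed at call time.
    public static IEnumerable<TResult> LazyValidatedSelect<TSource, TResult>(
        IEnumerable<TSource> source, Func<TSource, TResult> projection)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (projection == null) throw new ArgumentNullException(nameof(projection));
        foreach (TSource item in source)
        {
            yield return projection(item);
        }
    }

    // The workaround: an ordinary method validates immediately, then
    // hands over to a separate iterator block for the lazy part.
    public static IEnumerable<TResult> EagerValidatedSelect<TSource, TResult>(
        IEnumerable<TSource> source, Func<TSource, TResult> projection)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (projection == null) throw new ArgumentNullException(nameof(projection));
        return SelectIterator(source, projection);
    }

    private static IEnumerable<TResult> SelectIterator<TSource, TResult>(
        IEnumerable<TSource> source, Func<TSource, TResult> projection)
    {
        foreach (TSource item in source)
        {
            yield return projection(item);
        }
    }

    public static void Main()
    {
        // No exception here, even though source is null.
        IEnumerable<int> lazy = LazyValidatedSelect<int, int>(null, x => x * 2);
        Console.WriteLine("lazy call returned without throwing");

        try
        {
            EagerValidatedSelect<int, int>(null, x => x * 2);
        }
        catch (ArgumentNullException)
        {
            Console.WriteLine("eager call threw immediately");
        }
    }
}
```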
BROMBERG: Interesting. What would you say are some of the most distinguishing features that
have improved C# over the last few versions?
SKEET: Well, in some ways you've got some features that felt like they could really have
been in C# 1.0 to start with. Generics is the obvious example, or maybe nullable
value types. Things that are so fundamental within the type system, it would've
been really nice to have them from the start. So generics has been hugely important
but because it's sort of "foundational" it doesn't feel particularly
innovative, whereas LINQ in particular has just changed the way that thousands
and thousands of programmers think about dealing with data, and it's something
we've done so often. Every time you write a loop you think, "What am I doing
in here - am I taking some actions or am I just doing the same sort of filtering
or finding the maximum value?" - the kind of things that we've written dozens and
dozens of times, and it's filled that gap really, really nicely.
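As a small illustration of the "filtering or finding the maximum value" loops that LINQ replaces, here is the same task written both ways. The task itself (largest even number) and the names are assumptions picked for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LoopVsLinqDemo
{
    // The hand-written loop: filtering and max-tracking are tangled together.
    public static int MaxEvenWithLoop(IEnumerable<int> numbers)
    {
        bool found = false;
        int max = 0;
        foreach (int n in numbers)
        {
            if (n % 2 != 0)
            {
                continue;
            }
            if (!found || n > max)
            {
                max = n;
                found = true;
            }
        }
        if (!found)
        {
            throw new InvalidOperationException("No even numbers");
        }
        return max;
    }

    // The LINQ version: one operator per idea. Max() also throws
    // InvalidOperationException when nothing passes the filter.
    public static int MaxEvenWithLinq(IEnumerable<int> numbers)
    {
        return numbers.Where(n => n % 2 == 0).Max();
    }

    public static void Main()
    {
        int[] data = { 3, 8, 5, 12, 7 };
        Console.WriteLine(MaxEvenWithLoop(data)); // 12
        Console.WriteLine(MaxEvenWithLinq(data)); // 12
    }
}
```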
BROMBERG: Oh yeah. LINQ has totally changed the way I think about writing code. I try to use
it at every opportunity - because it's just efficient.
SKEET: Yeah, it's an efficient way of writing it - and it can be very, very efficient - more
efficient than the way you might write it normally, if you'd use a List<T>
or something and you fetched all the data first, and then you project all the
data, and then you filter some of the data and whatever... the way that LINQ
does things lazily and in a sort of streaming manner where it can, can make it
even more efficient. You need to know what's going on a bit, but then that's
hopefully where the book helps. But also LINQ has introduced people to lambda
expressions to a large extent, and once you've got the idea of "here I can represent
this function" it becomes extremely powerful... I use delegates far more
in my code than I used to. Certainly in C# 1.0 it was a bit of a joke - you had
to write an extra method, and even just creating an instance of the delegate
was a bit of a pain. And C# 2.0 was much nicer but still not quite there in
being really easy to produce these inline delegates, and then with lambda expressions
it's just so much nicer.
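The lazy, streaming behaviour mentioned above can be observed with a short sketch. `LazyLinqDemo`, `Source` and the `produced` list are hypothetical names used only to record how much of the source is actually consumed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyLinqDemo
{
    // A source that records each element as it is handed out, so we can
    // see how far the pipeline actually pulls.
    public static IEnumerable<int> Source(List<int> produced)
    {
        for (int i = 1; i <= 5; i++)
        {
            produced.Add(i);
            yield return i;
        }
    }

    // Project, filter, then take the first match. Each element streams
    // through the whole pipeline before the next one is requested.
    public static int FirstOverTwenty(List<int> produced)
    {
        return Source(produced)
            .Select(x => x * 10)
            .Where(x => x > 20)
            .First();
    }

    public static void Main()
    {
        var produced = new List<int>();
        int result = FirstOverTwenty(produced);
        Console.WriteLine(result);                     // 30
        Console.WriteLine(string.Join(",", produced)); // 1,2,3
    }
}
```

Elements 4 and 5 are never produced: `First()` stops pulling as soon as it has an answer, which is the saving over fetching, projecting and filtering everything into a `List<T>` up front.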
BROMBERG: Yes, everything's inline and it just works. It simplifies, I think, making the code
more readable.
SKEET: Absolutely. And it encourages thinking in a sort of functional way. Now there are
sort of upsides and downsides when it comes to C# 3.0 in terms of thinking functionally.
Obviously LINQ is primarily functional. You *can* write query expressions that
have side effects but it's a really bad idea and you're highly encouraged to
think in terms of small functions. On the other hand, they made it easier to write
mutable types without making immutability easy as well. Hopefully, maybe that
will be addressed at some point, but certainly just the idea of thinking in terms
of functions that can be composed - hopefully it's opened a lot of people's eyes;
it certainly has for me. It's leading people towards maybe F# where it's appropriate,
and again F# is a great language to learn just to encourage you to think functionally
back in C# again. I think it's really been opening the eyes of a lot of developers
worldwide and really bringing them on in terms of the whole way that they approach coding.
BROMBERG: Interesting. Let's talk a little bit farther on. You know there are some new proposed
C# 5.0 features that have already surfaced. For example the Async CTP and so
on.
SKEET: I watched Anders' talk a few weeks ago and read the spec and watched the video and
I was fortunate enough -- really fortunate -- to have Mads Torgersen visit on
his way to Tech Ed Europe. He came and stopped by our office and gave us a talk
about the bits of Async behavior that weren't quite right for me. The principal
premise of C# 5.0, as far as I can express it, is the ability to write synchronous-looking
code that has well-defined points of asynchronicity. So you write code: ABCDE
as you always have, and you think of it as executing in that order, and that's
fine. And then you think, well, actually step C is only fetching something from
the web and that could take a long time; I don't want to be tying up my thread
while that's happening. So you "await" C, and the way that works -- it's
very complicated internally although quite beautifully described in the spec
-- but effectively it's back to building a state machine, just like with iterator
blocks. So, it says OK - start doing whatever C was and then when that's finished
come back to do D and E - the last two bits of the method. And you think, "OK
- I can understand that, sort of," because it uses the Task Parallel Library
(TPL) from .NET 4. So OK, I've got this method that's effectively going to stop
halfway through. Well, what does that mean for the method itself? It can't return...
Suppose it was meant to return the total from some stock market or whatever -
it can't return the total yet because half of it's not executed. OK, so you
think of bringing the asynchronicity up one level: you have to declare this
method that contains "await" statements as being async,
and instead of returning, say, a decimal for the stock market total, it returns
a Task<decimal>. So then the caller can either call your method synchronously
and then wait for the results, or they can call it and add a continuation themselves.
So often you'll get chains of these asynchronous calls where one await feeds
into another await which feeds into another await, and if you imagine a webservice
doing lots of other webservice RPCs and maybe a bit of parallel processing as
well -- all kinds of things going to different services and taking up different
types of resources on various machines -- it lets you work with that very efficiently
but at the same time expressing it in a way that we're comfortable with. So writing
asynchronous code has been possible for a long time, just really messy. It's not so
bad getting the success part right -- but trying to get all the errors propagated
is a bit of a nightmare. So hopefully - we only have a technical preview at the
moment - but I have very high hopes for what C# 5.0 will bring.
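A minimal sketch of the async shape described above, written against the async/await syntax as it eventually shipped rather than the CTP discussed in the interview. `FetchPriceAsync` is a hypothetical stand-in for a slow web call (it just delays and derives a number from the symbol), and `TotalAsync` plays the role of the "stock market total" method that returns `Task<decimal>` instead of `decimal`.

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncDemo
{
    // Hypothetical stand-in for "step C": pretends to fetch a price.
    // Task.Delay simulates latency without blocking a thread.
    public static async Task<decimal> FetchPriceAsync(string symbol)
    {
        await Task.Delay(50);
        return symbol.Length * 1.5m; // made-up value, for illustration only
    }

    // Reads top to bottom like synchronous code, but each await is a
    // well-defined point where the method can pause and later resume.
    // Because it cannot return the total before it has finished, it
    // returns Task<decimal> rather than decimal.
    public static async Task<decimal> TotalAsync(params string[] symbols)
    {
        decimal total = 0m;
        foreach (string symbol in symbols)
        {
            total += await FetchPriceAsync(symbol);
        }
        return total;
    }

    public static void Main()
    {
        // Option 1: the caller blocks until the result is available.
        decimal total = TotalAsync("ABC", "DE").GetAwaiter().GetResult();
        Console.WriteLine(total == 7.5m); // True

        // Option 2: the caller attaches a continuation instead of blocking.
        Task<string> described = TotalAsync("ABC", "DE")
            .ContinueWith(t => "Total computed: " + (t.Result == 7.5m));
        Console.WriteLine(described.GetAwaiter().GetResult());
    }
}
```

An exception thrown inside `FetchPriceAsync` surfaces at the `await` in `TotalAsync` and then at whoever waits on the returned task, which is the error propagation that was painful to get right by hand.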
BROMBERG: That's fantastic. And of course the Task object is really the key to implementation.
SKEET: Absolutely.
BROMBERG: Well, great, Jon. I'm about done; I'd like to thank you for talking with us. We've
been interviewing Jon Skeet, the author of C# In Depth, Second Edition.
SKEET: My pleasure, it's been lovely to talk with you.
BROMBERG: Thanks, Jon. So long.
You can listen to the WMA Audio of the conversation here.