Never fail an interview question about System.String or StringBuilder again !!!!

Posted on 7/5/2006 @ 11:36 AM in #Vanilla .NET

So do you (like me) get annoyed when someone asks you a crazy question about System.String because it is the oddball class in .NET? Seriously !!! Anyway, here is the blog post to put an end to all those smartass interview questions.

So what is the weird thing about System.String? Well, it is a reference type, but it doesn't behave like the other reference types you know, and in a lot of ways it feels like a value type. It is what you may refer to as "an immutable object". Wotzatmean?

Check this code out --

static void Main(string[] args)
{
   
   int i;
   i = 0;
   myFunction(i);
   Console.WriteLine(i);
}

private static void myFunction(int i)
{
   i = i + 1;
}

The above code will print "0". (Surprised?) Why is that? Well, it's because the "i" (a value type) didn't get passed; its copy got passed.

Now modify the above to the following ---

static void Main(string[] args)
{
   DataTable dt = new DataTable();
   dt.Namespace = "Monkey";
   myFunction(dt);
   Console.WriteLine(dt.Namespace);
}

private static void myFunction(DataTable dt)
{
   dt.Namespace = "Modern Man";
}

The above code will print "Modern Man" (hmm, that's weird), and this is because DataTable, being an object, is a "reference type". In other words, when you passed it to myFunction, a copy of the reference to the actual object was made; a copy of the actual object was NOT made. Which means, because the new copy was still pointing to the same old memory location (that held your data table), guess what - the original DataTable got modified.

Now let's look at the very same code using String.

static void Main(string[] args)
{
   String str = "Monkey";
   myFunction(str);
   Console.WriteLine(str);
}

private static void myFunction(String str)
{
   str = "Modern Man";
}

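Now when you run this code, what do you get .. Monkey .. or Modern Man? You get "Monkey", but wait a minute, isn't String an Object? In fact it is a public sealed class .. so WTF happened? Well, it is an object, but it is an immutable object. Note the difference from the DataTable example: there we changed a property on the object the reference pointed to, while here we assign to the parameter itself, and a String gives you no way to change its contents in place. So at this very line of code ---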

   str = "Modern Man";

The parameter str was pointed at a different string instance ("Modern Man"). The previous instance (the one that was passed in as a parameter) was not touched; myFunction simply stopped referring to it. Note - it still won't be garbage collected, because static void Main is still using it.

So let's get this straight: anytime I "change" a string, I really get a different string instance, and the old one is left untouched? Yes, and this is because the previous memory is immutable. The only way that memory gets reused is if it is garbage collected and reallocated (which is really the framework's job).
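Here is a minimal sketch of that idea, assuming nothing beyond the base class library. The names ImmutableDemo and AppendSuffix are made up for illustration; they are not from any framework. It shows that appending inside a method hands back a different instance and leaves the caller's string alone.

```csharp
using System;

static class ImmutableDemo
{
    // Hypothetical helper: appends to its parameter and returns the result.
    // The caller's string is never modified.
    public static string AppendSuffix(string s)
    {
        s += " Man";   // builds a new string and repoints the local s
        return s;
    }

    static void Main()
    {
        string original = "Monkey";
        string result = AppendSuffix(original);

        Console.WriteLine(original);   // Monkey
        Console.WriteLine(result);     // Monkey Man
        Console.WriteLine(object.ReferenceEquals(original, result));   // False
    }
}
```

The ReferenceEquals check is the interesting bit: original and result are two separate objects, and original still says "Monkey".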

So let's see now, look at this code below ---

String str = "Sahil " + "is" + " a" + " modern" + " man";

So in the code above, how many times did the memory get allocated, and then de-allocated? For literals like these, not many: the compiler can fold the constants into a single string (see the comments below). But dude, picture those five pieces as variables being appended one statement at a time with +=. Now every append has to allocate a brand new string and copy everything built so far into it, over and over again, until it gets the final "str". This is why you should use StringBuilder when you build a string up in many steps - because that is not an immutable object. It has the ability to reuse the same memory.
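To make the comparison concrete, here is a rough sketch of both approaches side by side. BuilderDemo, JoinWithConcat and JoinWithBuilder are names I made up for this example; StringBuilder.Append and ToString are the real System.Text calls. Treat it as an illustration of the allocation pattern, not a benchmark.

```csharp
using System;
using System.Text;

static class BuilderDemo
{
    // Each += allocates a new string and copies the old contents into it.
    public static string JoinWithConcat(string[] words)
    {
        string result = "";
        foreach (string word in words)
        {
            result += word;
        }
        return result;
    }

    // StringBuilder appends into its own mutable buffer;
    // the final string is produced once, by ToString().
    public static string JoinWithBuilder(string[] words)
    {
        StringBuilder sb = new StringBuilder();
        foreach (string word in words)
        {
            sb.Append(word);
        }
        return sb.ToString();
    }

    static void Main()
    {
        string[] words = { "Sahil ", "is", " a", " modern", " man" };
        Console.WriteLine(JoinWithConcat(words));    // Sahil is a modern man
        Console.WriteLine(JoinWithBuilder(words));   // Sahil is a modern man
    }
}
```

Both methods return the same text. The difference is in how many intermediate strings get created along the way, which only starts to matter when the number of appends grows.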

Now this has other implications too, but as long as you remember to say "Strings are immutable" and sufficiently explain what you meant by that, your interviewer will be more or less happy.

 


Older comments..


On 7/5/2006 12:27:31 PM Adam said ..
Actually...

String str = "Sahil " + "is" + " a" + " modern" + " man";

Is 6 individual strings. The 5 for the ones embedded in there, then the C# compiler translates all that into a string.Concat method call so there's only one more string instance used.

If you had:

str = "Sahil";
str += "is";
str += "a";
str += "modern";
str += "man";

Then you would end up with 5 plus one per concat operation.


On 7/5/2006 12:52:51 PM Eber Irigoyen said ..
"String str = "Sahil " + "is" + " a" + " modern" + " man";

So right in the code above, how many times did the memory get allocated, and then de-allocated? Dude in the above code, first the framework will declare memory for all 5 strings,"

are you sure?


On 7/5/2006 1:38:03 PM Will Rickards said ..
In your last example:


String str = "Sahil " + "is" + " a" + " modern" + " man";

This wouldn't be 5 memory allocations. The whole string on the right is a constant and should be optimized away as such, thus only one memory allocation. But yes in general if what is on the right isn't a constant and would result in more than 3 memory allocations, use StringBuilder.


On 7/5/2006 1:45:18 PM Ayende Rahien said ..
Um, no.


String str = "Sahil " + "is" + " a" + " modern" + " man";


This has no string concatenation.

ldstr "Sahil is a modern man"

Is what it looks like in the IL.


On 7/5/2006 1:59:16 PM Sahil Malik said ..
Okay did you guys check that in C# 1.1 or 2.0? I was under the impression that this optimization has been introduced in 2.0 .. can anyone confirm? (I don't have 1.x installed).


On 7/5/2006 2:14:31 PM Adam said ..
In 1.1 it's a string.Concat. In 2.0 it's a constant value as Ayende suggested.

I didn't know about that optimization. It's of limited value (it seems to only optimize for things on the same line), but still nice.


On 7/5/2006 2:56:43 PM Sahil Malik said ..
Thanks Adam. *Whew* for a moment, I was thinking I was going crazy. .. well .. I mean, I was going crazier :).


On 7/5/2006 3:29:07 PM Adam said ..
To Will:

Actually if you're using one line of code to concatenate strings, a StringBuilder is less efficient than just using the + operator. The non-constant + gets transformed into a string.Concat in both 1.1 and 2.0 and that method is more efficient than using a StringBuilder.


On 7/5/2006 9:36:25 PM Kent Boogaart said ..
Another thing is your example is a little misleading. Reassigning a reference parameter will never propagate back to the caller (unless the reference parameter is itself passed by reference).

static void Main(string[] args)
{
   object o = new SomeClass("some ID");
   myFunction(o);
   Console.WriteLine(o);
}

private static void myFunction(object o)
{
   o = new SomeClass("Some other ID");
}

That will print "some ID" assuming SomeClass' ToString() has been overridden to output the ID. But if you change it to pass by reference (and call it as myFunction(ref o)):

private static void myFunction(ref object o)
{
   o = new SomeClass("Some other ID");
}

It will output "Some other ID". The same goes for your string example:

static void Main(string[] args)
{
   String str = "Monkey";
   myFunction(ref str);
   Console.WriteLine(str);
   Console.ReadKey();
}

private static void myFunction(ref String str)
{
   str = "Modern Man";
}

This will output "Modern Man".


On 7/5/2006 10:20:32 PM Sahil Malik said ..
So I'm confused. Why is it misleading? :-/


On 7/6/2006 1:02:19 AM Chris Ongsuco said ..
I don't see any misleading line in the post. Maybe he got confused? :-P


On 7/6/2006 1:08:15 AM Kent Boogaart said ..
[quote]Now when you run this code, what do you get .. Monkey .. or Modern Man? You get "Monkey", but wait a minute, isn't String an Object? In fact it is a public sealed class .. so WTF happened?[/quote]

The same thing that would happen regardless of what reference type you are passing - string or otherwise. Unless you pass the reference by reference then you won't be able to modify the object that it refers to and have that change propagate back to the caller. That is why I think the example is a little off-topic or misleading if you will. It really has nothing specific to do with strings. Or maybe I'm just too tired...


On 7/6/2006 8:40:36 AM Sahil Malik said ..
Kent, okay maybe you have a point - but I really think we're splitting hairs @ this point. The above 3 examples, I thought, were in perfect juxtaposition - but the point of the post was to demonstrate how String behaves, and I think that got across.


On 7/19/2006 4:55:45 AM jokiz said ..
sahil, i believe this one was discussed in your previous blog post (i've read this before) so i created a thread here (http://msforums.ph/forums/102913/ShowPost.aspx). this is true even in 1.1 and the compiler i believe is behind this scene.


On 1/30/2008 4:35:26 AM Dotnet new bie said ..
That's great, thanks for that. The explanation is spot on. Now I understand WTF is happening.


On 7/10/2008 10:32:43 AM Aaron said ..
You don't ever return int, so it's no wonder it prints out 0. How can you expect a void function to pass out the value of something generated inside of it? Inside myFunction, I'm sure i = i + 1, which is 1. But no return and no reference for int &i in the function declaration means anything done inside of that function will stay inside of that function. C++ 101


On 7/10/2008 11:05:23 AM Sahil Malik said ..
Aaron - If you're really good, you can make void return int ;)

Okay yeah, there was a typo. So what!


On 7/19/2008 3:26:37 AM sunil kumar rauto said ..
Actually, I have faced interviews but the results are not coming, and that's my problem. I have plenty of confidence, but I don't know what my problem is. Please give me a few steps on how to face and handle an interview.