Demystifying DLINQ: Part3 - Querying for Data using DLINQ

Posted on 6/30/2006 @ 8:57 PM in #Vanilla .NET by | Feedback | 8915 views

Welcome to Part 3 of my DLINQ series. Following are what we have already seen –

a) Demystifying DLINQ: Part1 - An introduction to DLINQ
b) Demystifying DLINQ: Part2 - Setting up a DataContext Class

In this post, it made sense to solidify our understanding by first creating a DataContext on (what else) the NorthWind database, and secondly to use that DataContext in a few real queries and try and understand what exactly goes on behind the scenes, plus, get a practical understanding as far as “Select” queries go.

So, go ahead and setup your strongly typed DataContext using the following command –

C:\>"c:\Program Files\LINQ Preview\bin\sqlmetal" /server:. /pluralize /database:Northwind  /code: Northwind.cs

Now create a ConsoleApplication – preferably a LINQConsoleApp (so all the references are added for you), and add the NorthWind.cs file to your console application.

Awesome !! Now let us get started. Go ahead and execute the following code –

Northwind db = new Northwind(connStr) ;

var customers =
    from c in db.Customers
    select c ; 

foreach (Customer customer in customers)
{
    Console.WriteLine("Customer Name:" + customer.ContactName) ;
}
 

Console
.ReadLine() ;

You would note an output that looks like as below -

 

Now it is important to realize that the query didn't really execute, UNTIL you were at the foreach loop. The "var customers" is simply "potential to execute a query" - the query executes afterwards.

Now rewrite the query as below -

var customers =
    from c in db.Customers
    where c.City == "London"
    select c ;

Now the results are filtered, by only customers who are in the city "London" as shown below -

The obvious question here is, "What is the TSQL query being executed under the scenes". In the previous CTP, you would have to run profile to "sniff" the queries out. The May CTP includes a fantastic new feature called the "Query Visualizer". Right after the query, say on foreach, put a breakpoint, and then hover over "customers" in debug mode to see the magnifying glass. Click on that to bring up the visualizer, which tells you the exact query being executed under the scenes. This is as shown below -

Now lets make things a bit more interesting. Write a query that looks like as below -

var custOrders =
    from o in db.Orders
    from c in db.Customers
    where c.CustomerID == "ALFKI"
    where c.CustomerID == o.CustomerID

    select new {c.ContactName, o.ShipName} ;

Now when you hover over "custOrders" and check the query out in the visualizer, it looks like this -

SELECT [t1].[ContactName], [t0].[ShipName]
FROM [Orders] AS [t0], [Customers] AS [t1]
WHERE ([t1].[CustomerID] = [t0].[CustomerID]) AND ([t1].[CustomerID] = 'ALFKI')

Interestingly, if in the previous CTP you had changed the order of "c" and "o", the generated query would be completely different. The May CTP is smart enough to understand the difference, and generate efficient queries in both cases. :)

So far so good. We are happily doing joins. The above query, in fact can also be rewritten as -

var custOrders =
    from o in db.Orders
    join c in db.Customers on o.CustomerID equals c.CustomerID
          where c.CustomerID == "ALFKI"
    select new {c.ContactName, o.ShipName};

But wait a minute - we're so used to fetching flat relational data, that we didn't even notice it - we're fetching "rows X columns" again :-/. Indeed the above when executed gives you an output that looks like this -

Well thats no good. Especially when we are used to working with objects as "hierarchies". What we are really trying to achieve here is "Customer Maria Anders (one instance) has a collection of "Orders". So it's a one to many relationship, which means, "Customer" should have a property called "Orders[]" .. i.e. an Order Collection - or something similar to that.

Hmm .. well in database, the closest equivalent we have to such a grouping .. is .. yeah GROUP. So yeah we need to "Group" the results. That was easy :). And here is how you would group -

var custOrders =
    from c in db.Customers
    join o in db.Orders on c.CustomerID equals o.CustomerID into orders
    where c.CustomerID == "ALFKI"
    select new {c.ContactName, orders};

var list = custOrders.ToList() ;

foreach (var listItem in list)
{
    Console.WriteLine(listItem.ContactName + " has " + listItem.orders.Count() + " orders, which have been shipped to:") ;
    foreach (Order order in listItem.orders)
    {
        Console.WriteLine(" Order shipped to - " + order.ShipCountry) ;
    }
}

This produces the following output -  

 

Also, you can check the generated TSQL generated using the Query Visualizer. There are two queries generated in this case -

SELECT [t0].[ContactName], (
SELECT COUNT(*)
FROM [Orders] AS [t1]
WHERE [t0].[CustomerID] = [t1].[CustomerID]
) AS [orders]
FROM [Customers] AS [t0]
WHERE [t0].[CustomerID] = 'ALFKI'
ORDER BY [t0].[CustomerID]

and

SELECT [t1].[OrderID], [t1].[CustomerID], [t1].[EmployeeID], [t1].[OrderDate], [t1].[RequiredDate], [t1].[ShippedDate], [t1].[ShipVia], [t1].[Freight], [t1].[ShipName], [t1].[ShipAddress], [t1].[ShipCity], [t1].[ShipRegion], [t1].[ShipPostalCode], [t1].[ShipCountry]
FROM [Customers] AS [t0], [Orders] AS [t1]
WHERE ([t0].[CustomerID] = 'ALFKI') AND ([t0].[CustomerID] = [t1].[CustomerID])
ORDER BY [t0].[CustomerID], [t1].[OrderID]

.. truthfully, not bad to have all that work done for you !!!

Now, there was another RATHER interesting thing that happened in this code above. Examine this line again -

var list = custOrders.ToList() ;

The ToList method effectively executed the queries right then, and loaded 'em up in the anonymous type "list". So while the foreach loop is running, your database connection is closed, being pooled by other requests - at the cost of higher memory usage.

.. there are a few other important concepts concerning "Selection of data" using DLINQ. I would recommend playing with DLINQ queries on your own, and I will return with another post on "Selecting" data - with a few new funky concepts :), and then move on to other topics within DLINQ. .. yeah you guessed it right, DLINQ is a frickin' huge topic already.

 

Sound off but keep it civil:

Older comments..


On 2/11/2008 3:34:22 PM Scott Davis said ..
Sahil,

Great content. Thanks for saving me countless hours. I have a related question. So I have a data class (table) for Linq and I have a data class (DataContract) for WCF. Is there an easy way to use the same class for both, or can I easily transfer data from one to the other? It would seem silly to me with all these advances if I were to write a for loop, creating objects and marshalling data from one data class to another. I'd like to do it the "smart" way, but I'm afraid I'm not that smart yet. Thanks....


On 4/7/2008 5:43:56 PM Trey said ..
Your example of "grouping" through joins was very helpful. However, I have situation where 3 tables must be joined. Logically, each "Publication" has a collection of "Articles" (much like your Customer/Order example). The difference is that any particular Article may appear in multiple publications. For example, "Here's the News!" might belong to both the "Team News" and "Supervisor's News" publications. Thus I had to make a middle table that contains 2 fields (PublicationID and ArticleID) as joint primary key, so that the article itself is only in the db once, but can be tied to multiple publications.

So the table relationships look like this:

Publications (One-To-Many) PublishedArticles (Many-To-One) Articles

My question is, how do I return a list of the article items belonging to a particular location? The data set I'd like to return should look like your example above:

* the name of the Publication queried


* the collection of Articles for that publication

Thank you a ton... i'm really enjoying DLING and your articles and how-tos.


On 11/2/2009 6:44:33 AM Manoj Agarwal said ..
Hi Sahil,

For the last Linq query (with join) why should there be two queries run? There should just be one query. I ran this through a profiler and found that only this query is run (which is correct)-


SELECT [t1].[ContactName], [t0].[ShipName]


FROM [dbo].[Orders] AS [t0], [dbo].[Customers] AS [t1]


WHERE ([t1].[CustomerID] = [t0].[CustomerID]) AND ([t1].[CustomerID] = @p0)

BTW- These are great tutorials!


On 1/4/2010 1:34:17 AM Amit Trivedi said ..
Great content to Learn DLinq !!!