Preamble
LINQ is a modern query language which presents a simple, yet powerful way to query data-collections in a variety of sources. It offers a consistent model for working with data from a variety of sources (XML documents, SQL databases, ADO.NET databases, .NET collections etc.)
Motivation
LINQ presents an opportunity to manipulate data from within a .NET based environment - raising the bar for software-based data-manipulation and querying. In addition it reinforces learning-concepts and interaction modalities within the C# language.
(Inner) Joins
Join simply put groups data across two collections and are a common operation in data-querying and management. The syntax LINQ employs follows common convention.
First lets build our data objects.
// 1.
internal class Supplier
{
public string Name { get; set; }
public int Age { get; set; }
public string District { get; set;}
}
internal class Buyer
{
public string Name { get; set; }
public int Age { get; set; }
public string District { get; set; }
}
List<Buyer> buyers = new List<Buyer>()
{
new Buyer() {Name = "John", Age = 25, District = "Point Break"},
new Buyer() {Name = "Paul", Age = 35, District = "West Side"},
new Buyer() {Name = "Angie", Age = 30, District = "West Side"},
new Buyer() {Name = "Flo", Age = 45, District = "Central"},
new Buyer() {Name = "Mary", Age = 28, District = "Central"},
new Buyer() {Name = "Joan", Age = 30, District = "South"},
new Buyer() {Name = "George", Age = 55, District = "South"}
};
List<Supplier> suppliers = new List<Supplier>()
{
new Supplier() {Name = "Andy", Age = 25, District = "Point Break"},
new Supplier() {Name = "Tod", Age = 35, District = "West Side"},
new Supplier() {Name = "Ian", Age = 40, District = "West Side"},
new Supplier() {Name = "Anna", Age = 30, District = "South"},
new Supplier() {Name = "Jerry", Age = 28, District = "Central"},
new Supplier() {Name = "Paul", Age = 33, District = "Central"}
};
And then join on a single key. The clauses join...in...
and on...equals
are used to define the relationships between the collections.
Here with on...
we can refer to the first range variable (s), and with equals
the second (b).
The results are placed in an Anonymous object via new
.
// 1. ....continued
var innerJoin = from s in suppliers
join b in buyers on s.District equals b.District
select new
{
SupplierName = s.Name,
BuyerName = b.Name,
s.District
}
foreach (var item in innerJoin)
{
Console.WriteLine($"District: {item.District}, Supplier Name: {item.SupplierName}, Buyer Name: {item.BuyerName}");
}
District: Point Break, Supplier Name: Andy, Buyer Name: John
District: West Side, Supplier Name: Tod, Buyer Name: Paul
District: West Side, Supplier Name: Tod, Buyer Name: Angie
District: West Side, Supplier Name: Ian, Buyer Name: Paul
District: West Side, Supplier Name: Ian, Buyer Name: Angie
District: South, Supplier Name: Anna, Buyer Name: Joan
District: South, Supplier Name: Anna, Buyer Name: George
District: Central, Supplier Name: Jerry, Buyer Name: Flo
District: Central, Supplier Name: Jerry, Buyer Name: Mary
District: Central, Supplier Name: Paul, Buyer Name: Flo
District: Central, Supplier Name: Paul, Buyer Name: Mary
Composite Joins
Composite Joins occur when multiple keys are matched across the collections.
The comparison components are held in Anonymous objects - it’s important to note that their types must match, as well as their ordering within the respective objects.
We can capture the results (and the common keys, District and Age) in Supplier by referencing s
directly.
In this case we only have 4 matches where Buyer/Supplier Age + District are identical.
// 2.
var compositeJoin = from s in suppliers
join b in buyers on new {s.District, s.Age} equals new {b.District, b.Age}
select new
{
Supplier = s,
BuyerName = b.Name
};
foreach (var item in compositeJoin)
{
Console.WriteLine($"District: {item.Supplier.District}, Age: {item.Supplier.Age}");
Console.WriteLine($" Buyer Name: {item.BuyerName}");
}
District: Point Break, Age: 25
Buyer Name: John
District: West Side, Age: 35
Buyer Name: Paul
District: South, Age: 30
Buyer Name: Joan
District: Central, Age: 28
Buyer Name: Mary
Group(ed) Joins
Group(ed) Joins fuse two collections by a common key, with a grouping of the secondary collection on the key.
This presents an improvement on standard inner joins where the common properties are (repeatedly) present in every unified item.
The syntax for the join is identical to that of the inner join with the additional of into...GroupName
.
The grouping variable inside of the Anonymous Object new
is itself of type IEnumerbale
- and thus a nested loop is required to display the elements within.
The names of the items referred to within Console.WriteLine
help make clear which of the data entities are being iterated through.
// 3.
var groupJoin = from s in suppliers
join by in buyers on s.District equals b.Distrit into buyersGroup
select new
{
s.Name,
s.District,
Buyers = buyersGroup
}
foreach(var supplier in groupJoin)
{
Console.WriteLine($"Supplier: {supplier.Name}, District: {supplier.District}");
foreach(var buyer in supplier.Buyers)
{
Console.WriteLine($" Buyer Name: {buyer.Name}, Buyer Age: {buyer.Age}");
}
}
Supplier: Andy, District: Point Break
Buyer Name: John, Buyer Age: 25
Supplier: Tod, District: West Side
Buyer Name: Paul, Buyer Age: 35
Buyer Name: Angie, Buyer Age: 30
Supplier: Ian, District: West Side
Buyer Name: Paul, Buyer Age: 35
Buyer Name: Angie, Buyer Age: 30
Supplier: Anna, District: South
Buyer Name: Joan, Buyer Age: 30
Buyer Name: George, Buyer Age: 55
Supplier: Jerry, District: Central
Buyer Name: Flo, Buyer Age: 45
Buyer Name: Mary, Buyer Age: 28
Supplier: Paul, District: Central
Buyer Name: Flo, Buyer Age: 45
Buyer Name: Mary, Buyer Age: 28
Inner Group(ed) Joins
As we saw, the grouping variable inside the Anonymous Object is an IEnumerbale
- and thus it can also be queried with LINQ (from...in
, orderby
etc.) - offering full flexibility for sorting, conditionals etc.
Taking the example from above.
// 4.
var groupInnerJoin = from s in suppliers
join by in buyers on s.District equals b.Distrit into buyersGroup
select new
{
s.Name,
s.District,
Buyers = from b in buyersGroup
orderby b.Age
select b
}
foreach(var supplier in groupInnerJoin)
{
Console.WriteLine($"Supplier: {supplier.Name}, District: {supplier.District}");
foreach(var buyer in supplier.Buyers)
{
Console.WriteLine($" Buyer Name: {buyer.Name}, Buyer Age: {buyer.Age}");
}
}
Supplier: Andy, District: Point Break
Buyer Name: John, Buyer Age: 25
Supplier: Tod, District: West Side
Buyer Name: Angie, Buyer Age: 30
Buyer Name: Paul, Buyer Age: 35
Supplier: Ian, District: West Side
Buyer Name: Angie, Buyer Age: 30
Buyer Name: Paul, Buyer Age: 35
Supplier: Anna, District: South
Buyer Name: Joan, Buyer Age: 30
Buyer Name: George, Buyer Age: 55
Supplier: Jerry, District: Central
Buyer Name: Mary, Buyer Age: 28
Buyer Name: Flo, Buyer Age: 45
Supplier: Paul, District: Central
Buyer Name: Mary, Buyer Age: 28
Buyer Name: Flo, Buyer Age: 45
The data is now ordered (in ascending order) as expected.
(Left) Outer Joins
It begs the question - what happens in cases where the key does not match across the two collections? In these circumstances to is (often) desirable to return the items of the joined collection regardless of a match or not.
A non-matching Supplier was added to our Suppliers collections - no Buyers match this key.
The query looks the similar to the Group Join or Inner Group Join, building a result with select new
but the DefaultIfEmpty()
method is called, and a Buyer
Object is created to define the properties within the given unmatched item.
// 5.
var leftOuterJoin = from s in suppliers
join b in buyers on s.District equals b.District into buyersGroup
select new
{
s.Name,
S.District,
Buyers = buyersGroup.DefaultIfEmpty(
new Buyer()
{
Name = "NA",
District = "No District Matches"
}
)
};
foreach(var supplier in leftOuterJoin)
{
Console.WriteLine($"{supplier.District}, {supplier.name}");
foreach(var buyer in supplier.Buyers)
{
Console.WriteLine($" {buyer.District}, {buyer.Name}");
}
}
Point Break, Andy
Point Break, John
West Side, Tod
West Side, Paul
West Side, Angie
West Side, Ian
West Side, Paul
West Side, Angie
South, Anna
South, Joan
South, George
Central, Jerry
Central, Flo
Central, Mary
Central, Paul
Central, Flo
Central, Mary
Forest, MrP
No District Matches This Buyer, NA