What is logical query processing order in SAS?

Firstly, it reduces typing, which as we’re all aware, we are all trying to minimize. Secondly, if this same column name sits in both the tables, then of course, here it’s not a problem. Because the Chef table has a column called Name, the Spices table has a column called Recipe name. So here it wasn’t a big problem. But assume that the same column exists in multiple tables, and we want to join by those same columns. Then, I would need to provide a qualifier. And in that event, the qualifier, which is short and sweet, makes a lot more sense. Is this query going to give me the same results? We’ll take a look. And indeed, I’m getting the same results. So the question of a join would be incomplete if I did not mention efficiency. And I’m going to show you an alternative way of crafting your inner join by using the WHERE clause.

I’m, again, going to stick those aliases. Dinner. Chef as C, dinner. Spices as S, so that I can scrap this lengthy. S.name equals– I’m sorry. C.name equals S.Recipename. What was the difference in this code? The alternate syntax no longer uses the inner join words or keywords. It separates the two tables with a comma. And then in place of the ON clause, you use the WHERE clause to provide not only any filter that you need but also any join condition. So that works. Does it? We’ll take a look. And the results are the same. The difference is the efficiency. And here, I’m going to enter the space of the logical query processing order. Up until now, in section one, you learned the syntax order, and you knew how to respect the syntax of SQL.

Now, we look at the logical query processing order. And that means this is the way SQL processes your query and actually does the work under the covers. Surprise! It’s not going to do the work in the order in which it asks you to, speak to SQL. It’s going to do it in a completely different manner. So one of the first things that SQL will tackle in this logical query processing order is the FROM clause. So the FROM clause is the first processing. And then, it’s going to tackle the WHERE clause. So if we look at the inner join where we use the ON clause, can you see where there’s a little efficiency boost by using the ON clause? Because the ON clause is really part of FROM. Whereas, WHERE is the next statement. So that’s something to bear in mind as you craft your query in the difference between the WHERE and the ON. That’s one difference. The second difference is when I use the WHERE clause. And here in this example, I only have two tables.

What if I had 20 tables? Or maybe 30 tables. My WHERE condition becomes very complex, and I have to make sure I provide all of the join conditions correctly. If I leave it out, I might end up with the Cartesian product. So those are the two reasons why you might want to head over and use the ON clause for providing your join condition. Before we dive into code to look at the subquery, it would be a nice reminder to look once again at the concept of a subquery, and how it plays out. So I’d like to show you to visuals here. The first one is about the tables themselves.

Here is Spice’s table and the business question is, what dishes have a moderate heat rating, which is less than or equal to 3? This column heat scale exists in the Spices table. So that’s the one we are going to tackle. We’re going to grab rows from the Spices table, where that heat scale is less than or equal to 3. The rows that get returned need a purpose, need somewhere to go. And the only reason we are grabbing these rows is that so that we can see the chefs that correspond to this moderate heat rating. So we can ask them more questions about their dishes. So those rules are passed to the Chef table. And so there’s a match between the Recipe name on the Spices table, and the Name on the Chef table.

Let’s take another look at the visuals. So the rows that are returned are these. And this is the Recipe name. And this would be the subquery. We’re grabbing rows from the Spices table. Once these rows are returned based on my WHERE clause, I’ll pass these rows to the Chef table, so that now I can get the names of the Chef, the country of recipe origin, the link from the Chef table. So you can almost visualize that it is like a join, but we are crafting it as a subquery. Let’s head over to our SAS session and see this play out in code. We like to grab those rows that have a moderate heat rating from the Spices table.

The moderate heat rating is where the heat scale is less than or equal to 3. So my inner query or my subquery will return me the rows that have this heat rating on that six rows of data. Now, got pass these rows to the Chef table, so that I know the names of the chefs and I can contact them for more information. To do that, I’m going to have to copy these values from the Results window and go back and plunket in the code. Let’s examine what looks like complicated code and break it down. We are selecting all the columns, select asterisks, from the Chef table. But we don’t want all the rows, so we have a wear class and we qualify the rows we want to return based on a condition. That condition is where the name is on the list. And here is where we type in the list that was actually returned if you recall from the previous subquery.

What am I going to get if I submit this query? I’ll get the names of the dishes, as well as the chef’s names that give me the detail I was looking for. Now, this is a lot of manual intervention. And if you recall, my favorite task is not spending time doing this work. If I go back to my code, I’ll explain why. Firstly, any time data values change, I need to rerun my subquery first, head over to the Results window, copy those values and paste them here. There is a lot of manual intervention going on. And the minute human intervention enters, I know I’m prone to error.

Let’s see how we can craft this query in a slightly smarter way. We are going to dig the entire subquery, and just paste it instead of pasting the data values. Job well done. So the same query, no change. The only change is taking my subquery and pasting it where earlier, I was pasting data values. Submit, and see if we get the same results. And the results are the same. So there we go. We learned a second way of joining our tables, growing in wide, and also made it very efficient.

Leave a Comment