I am trying to join two tables in SQL (Snowflake) that have hierarchical data. Table A contains rows with all possible levels of the hierarchy in each row, whereas Table B contains individual rows for each level of the hierarchy. Additionally, some rows in Table A may contain a higher level of data than Table B. In this case, the goal is to pick the highest matching level from Table B. A challenge is that the level needed in Table A isn't constant as the matching level in Table B varies.
Below are examples of each table:
Table A:
Row Level 1 Level 2 Level 3
1 Animal Mammal Dog
2 Animal Mammal Cat
3 Animal Reptile Lizard
4 Animal Reptile Snake
5 Tree Oak Live Oak
6 Tree Elm Cedar Elm
Table B:
Row Level 1 Level 2 Level 3 Category
1 Animal Mammal Dog D
2 Animal Reptile Lizard L
3 Animal Mammal M
4 Animal Reptile R
5 Animal A
6 Tree Oak Live Oak LO
7 Tree Oak O
8 Tree T
Desired Results:
Row Level 1 Level 2 Level 3 Category
1 Animal Mammal Dog D
2 Animal Mammal Cat M
3 Animal Reptile Lizard L
4 Animal Reptile Snake R
5 Tree Oak Live Oak LO
6 Tree Elm Cedar Elm T
In the above results table, rows 1, 3, and 5 matched on Level 3. Rows 2 and 4 matched on Level 2. Row 6 matched on Level 1. Below is the code I've tried that produced the closest results:
SELECT a.lvl_1, a.lvl_2, a.lvl_3, b.category
FROM TABLE_A a
LEFT JOIN TABLE_B b
ON COALESCE(b.lvl_3, b.lvl_2, b.lvl_1) IN (a.lvl_1, a.lvl_2, a.lvl_3)
The problem is that the resulting table shows all rows with matches at any level. For example:
Current Results:
Row Level 1 Level 2 Level 3 Category
1 Animal Mammal Dog D
2 Animal Mammal Dog M
3 Animal Mammal Dog A
4 Animal Mammal Cat M
5 Animal Mammal Cat A
...
If I check for equality at a specific level, the results are correct for that level, but NULL for the other levels:
SELECT a.lvl_1, a.lvl_2, a.lvl_3, b.category
FROM TABLE_A a
LEFT JOIN TABLE_B b
ON COALESCE(b.lvl_3, b.lvl_2, b.lvl_1) = a.lvl_3
Equality Results:
Row Level 1 Level 2 Level 3 Category
1 Animal Mammal Dog D
2 Animal Mammal Cat null
3 Animal Reptile Lizard L
4 Animal Reptile Snake null
5 Tree Oak Live Oak LO
6 Tree Elm Cedar Elm null
How can I join Table B with Table A to return the single Category of the highest level match? Any help is greatly appreciated. Thank you.
I am trying to join two tables in SQL (Snowflake) that have hierarchical data. Table A contains rows with all possible levels of the hierarchy in each row, whereas Table B contains individual rows for each level of the hierarchy. Additionally, some rows in Table A may contain a higher level of data than Table B. In this case, the goal is to pick the highest matching level from Table B. A challenge is that the level needed in Table A isn't constant as the matching level in Table B varies.
Below are examples of each table:
Table A:
Row Level 1 Level 2 Level 3
1 Animal Mammal Dog
2 Animal Mammal Cat
3 Animal Reptile Lizard
4 Animal Reptile Snake
5 Tree Oak Live Oak
6 Tree Elm Cedar Elm
Table B:
Row Level 1 Level 2 Level 3 Category
1 Animal Mammal Dog D
2 Animal Reptile Lizard L
3 Animal Mammal M
4 Animal Reptile R
5 Animal A
6 Tree Oak Live Oak LO
7 Tree Oak O
8 Tree T
Desired Results:
Row Level 1 Level 2 Level 3 Category
1 Animal Mammal Dog D
2 Animal Mammal Cat M
3 Animal Reptile Lizard L
4 Animal Reptile Snake R
5 Tree Oak Live Oak LO
6 Tree Elm Cedar Elm T
In the above results table, rows 1, 3, and 5 matched on Level 3. Rows 2 and 4 matched on Level 2. Row 6 matched on Level 1. Below is the code I've tried that produced the closest results:
SELECT a.lvl_1, a.lvl_2, a.lvl_3, b.category
FROM TABLE_A a
LEFT JOIN TABLE_B b
ON COALESCE(b.lvl_3, b.lvl_2, b.lvl_1) IN (a.lvl_1, a.lvl_2, a.lvl_3)
The problem is that the resulting table shows all rows with matches at any level. For example:
Current Results:
Row Level 1 Level 2 Level 3 Category
1 Animal Mammal Dog D
2 Animal Mammal Dog M
3 Animal Mammal Dog A
4 Animal Mammal Cat M
5 Animal Mammal Cat A
...
If I check for equality at a specific level, the results are correct for that level, but NULL for the other levels:
SELECT a.lvl_1, a.lvl_2, a.lvl_3, b.category
FROM TABLE_A a
LEFT JOIN TABLE_B b
ON COALESCE(b.lvl_3, b.lvl_2, b.lvl_1) = a.lvl_3
Equality Results:
Row Level 1 Level 2 Level 3 Category
1 Animal Mammal Dog D
2 Animal Mammal Cat null
3 Animal Reptile Lizard L
4 Animal Reptile Snake null
5 Tree Oak Live Oak LO
6 Tree Elm Cedar Elm null
How can I join Table B with Table A to return the single Category of the highest level match? Any help is greatly appreciated. Thank you.
Share Improve this question asked 1 hour ago j.smithj.smith 372 silver badges7 bronze badges 1- Please tag the RDBMS you are using... – Dale K Commented 1 hour ago
1 Answer
Reset to default 0Finding categories using joins would be appropriate to perform in the following procedure:
- Temporary table 1. All columns of table_a, plus add a column containing category values that match completely up to level3
- Temporary table 2. For rows in temporary table 1 where category is null, find and set categories matching up to level2
- Temporary table 3. For rows in temporary table 2 where category is null, find and set categories matching up to level1
On the other hand, in this situation it also seems possible to apply the coalesce() function to possible categories without using joins:
select
*,
coalesce(
(select category from table_b b where b.level_1 = a.level_1 and b.level_2 = a.level_2 and b.level_3 = a.level_3),
(select category from table_b b where b.level_1 = a.level_1 and b.level_2 = a.level_2 and b.level_3 is null),
(select category from table_b b where b.level_1 = a.level_1 and b.level_2 is null and b.level_3 is null)
) category
from table_a a