4

I am new to database, and below is one of my query with it's explanation, I feel it's too complex. It works, but I am worried the design is bad.

Provide a list of customer information for customers who purchased anything written by the most profitable author in the database.

SELECT E.id,NAMES.fname, NAMES.lname, E.address, E.city, E.state,E.country,    E.email, E.phone, E.postalcode
FROM ENTITY AS E, (SELECT DISTINCT T.customer_id
               FROM (SELECT  DISTINCT O.order_id
                     FROM ORDERITEM AS O
                     WHERE O.bid IN (SELECT DISTINCT A.b_id
                                     FROM AUTHOR AS A
                                     WHERE A.name_id IN (SELECT DISTINCT NA.name_id
                                                         FROM (SELECT A.name_id, A.b_id
                                                               FROM (AUTHOR AS A JOIN NAMES AS N ON (A.name_id = N.id))
                                                               GROUP BY A.name_id) AS NA,(SELECT B.b_id AS bid, max(C.total_quantity * B.price), B.title
                                                                                          FROM BOOK AS B, (SELECT bid, SUM(quantity) AS total_quantity
                                                                                                           FROM ORDERITEM AS O
                                                                                                           GROUP BY O.bid) AS C
                                                                                            WHERE B.b_id = C.bid) AS BD
                                                         WHERE NA.b_id = BD.bid))) AS OID JOIN ORDER_TRANSACTION AS T ON (OID.order_id = T.order_id)) AS TID, NAMES
WHERE TID.customer_id = E.id AND
NAMES.id = E.name_id

And This is my design, look very simple.

enter image description here

bagMan
  • 163
  • 6

3 Answers3

5

The answer is: divide to conquer! Each sub-query should become a query (or a view) of its own. Use sub-queries sparingly. Use JOINs instead.

You would need at least one query (using JOINs) for:

  • AUTHOR

  • CUSTOMER

  • BOOK

  • ...

Then you can test your queries or views independently (very useful if you are having trouble).

fralau
  • 201
  • 1
  • 3
2

Overall, your design looks pretty sound. Keep in mind that I have a limited understanding of your subject matter, and only a vague idea of the way you intend to use this data, aside from the query challenge you posted.

There are a couple of places where the design looks strange to me. In particular, the table you have called "NAMES" looks to me like it has one entry for each author. I would have called that table AUTHOR. And I would have called the table you called AUTHOR something like WRITTEN_BY. It's really a junction table between books and authors. But if your choice of table names makes more sense to you, then keep what makes sense to you.

Also, it looks like NAMES is missing a field called entity_id that would reference an entry in ENTITY. Compare with the CUSTOMER table. I'm not sure what name_id is doing in the ENTITY table. It looks like the relationship is backwards, but I'm not sure.

When you go to actually build the database, you will need to add features specific to sqlite. That's not my DBMS, so I'll stay out of that. I will say that you'll do well to declare a primary key for each table. In some cases, this will have to be a compound key, as in your AUTHOR table. You'll also do well to declare any keys that reference other tables as foreign keys. It's well worth the overhead in order to keep you from mismanaging your data.

I'll note in passing that your design of the table ENTITY is an instance of a superclass/subclass pattern and a design technique called class table inheritance. ENTITY (describing contacts) is a superclass, and CUSTOMER and NAMES are both subclasses. You've done a good job here, but knowing the buzzwords may help you find additional reading material.

I had to make a couple of assumptions in order to make sense out of the query challenge you posed. First, I assume that "most profitable" means the one with the most dollar (or other currency) volume of sales. Second, I assume that in the unlikely event of a tie for first place, you'll want customer lists for all the top authors.

Your query is unnecessarily complex, but even if it were ultimately simplified it would still be a fairly complex deal. There are three problems in one here. First is to figure out the dollar volume of the most profitable author. Second, is to figure out a list of NAMES that meet this dollar volume, and third is to join those NAMES and AUTHORS and BOOKS, and ORDER_ITEMS and ORDER_TRANSACTIONS and CUSTOMERS and ENTITY, so as to pickup the contact info for the customers. If I have enough time, I'll take a shot at it, but I don't have the facilities here to try to debug my effort.

Last what is simple to the reader may not be simple to the query optimizer. I don't know how good the sqlite optimizer is at finding an optimal strategy. For some other database products, the difference between an optimal strategy and a mediocre one can be a hundred to one in terms of performance. There are things a database designer can do to affect the optimizer. But in general, it's better to let the optimizer do the optimizing.

Walter Mitty
  • 4,416
  • 21
  • 22
1

I tried to make sense of this and this is a far as I got
This needs some clean up

  SELECT E.id,NAMES.fname, NAMES.lname, E.address, E.city, E.state,E.country, E.email, E.phone, E.postalcode
  FROM ENTITY AS E
, ( SELECT DISTINCT T.customer_id
    FROM ( SELECT DISTINCT O.order_id
           FROM ORDERITEM AS O
           WHERE O.bid IN ( SELECT DISTINCT A.b_id
                            FROM AUTHOR AS A
                            WHERE A.name_id IN ( SELECT DISTINCT NA.name_id
                                                 FROM ( SELECT A.name_id, A.b_id
                                                        FROM ( AUTHOR AS A 
                                                               JOIN NAMES AS N 
                                                                ON (A.name_id = N.id)
                                                             )
                                                         GROUP BY A.name_id
                                                      ) AS NA 
                                             , ( SELECT B.b_id AS bid, max(C.total_quantity * B.price), B.title 
                                                 FROM BOOK AS B
                                                 , ( SELECT bid, SUM(quantity) AS total_quantity 
                                                     FROM ORDERITEM AS O
                                                     GROUP BY O.bid
                                                   ) AS C  
                                                 WHERE B.b_id = C.bid 
                         ) AS BD
                         WHERE NA.b_id = BD.bid
           )
        )
) AS OID 
JOIN ORDER_TRANSACTION AS T ON (OID.order_id = T.order_id)
) AS TID, NAMES
WHERE TID.customer_id = E.id   
AND NAMES.id = E.name_id
paparazzo
  • 5,048
  • 1
  • 19
  • 32