# Miscellaneous > Structured Query Language (SQL) >  Massive query, where NOT IN does not appear to work?

## azcn2503

Hello

I'll keep it to the point so we can get down to the nitty gritty!



Table: tags
Fields: tid INT, tname VARCHAR

Table: uploads
Fields: uid INT, ufile VARCHAR

Table: tags_upoads
Fields: tuid INT, tid INT, uid INT



Step 1: From a webpage, a user uploads an image. The filepath to the upload is stored in the uploads table.

Step 2: A user can then assign tags to an upload to describe its data. When a tag is submitted to the server, it checks against the database to see if it already exists, and if it doesn't it will create a new one.

Step 3: Then the server will link those tag ID's to the upload ID using the tags_uploads table. So, uploads can have many tags.



For reporting purposes, I would like to generate a result from the SQL server that meets the following conditions:

* Shows distinct uploaded filenames
* Only if the tag ID matches a certain tag number(s)
* Only if the tag ID does not match other certain tag number(s)
* Ordered by the popularity of the tags selected (number of times the tag ID has been assigned to an upload ID)

However, imagine now that some uploads are tagged with a tag ID of 5 AND 6, but I want to show the uploads that are tagged 5 and exclude the uploads that are tagged with 6. How is that done?

Imagine this data set in tags_uploads for this example:
tuid=1, uid=1, tid=5
tuid=2, uid=1, tid=6
tuid=3, uid=2, tid=5
tuid=4, uid=2, tid=6
tuid=5, uid=3, tid=5

In this example, uid's 1 and 2 are tagged with tid's 5 and 6. Only uid 5 is tagged with just 5 on its own.

However these results need to be pulled back as part of an extremely large query which selects just distinct uploads, which is as follows:



```
SELECT uploads.uid, uploads.ufile, tu.tid, counter.tidcount
FROM uploads
INNER JOIN (

SELECT DISTINCT tags_uploads.uid, tags_uploads.tid
FROM tags_uploads
WHERE tags_uploads.tid
IN ( 5 ) 
AND tags_uploads.tid NOT 
IN ( 6 ) 
GROUP BY tags_uploads.uid
) AS tu ON uploads.uid = tu.uid
INNER JOIN tags ON tags.tid = tu.tid
INNER JOIN (

SELECT tags_uploads.tid, COUNT( * ) AS tidcount
FROM tags_uploads
WHERE tags_uploads.tid
IN ( 5 ) 
AND tags_uploads.tid NOT 
IN ( 6 ) 
GROUP BY tags_uploads.tid
) AS counter ON tu.tid = counter.tid
ORDER BY counter.tidcount DESC , uploads.uid DESC
```

If I remove the code that says _AND tags_uploads.tid NOT IN (6)_ from both subqueries, then I receive the exact same set of results.

Please can somebody explain what I am missing here, and what is required for this to now work?

I feel that this is an exceptionally complex question and I really will put my thanks out to the person that can help me with this. I'd even consider payment in some form.



Thanks and good luck  :Smilie:

----------


## azcn2503

Perhaps a better way to visualise the data would be like this:

uploads
uid = 1, ufile = redapple1.jpg
uid = 2, ufile = redapple2.jpg
uid = 3, ufile = greenapple.jpg

tags
tid = 1, tname = apple
tid = 2, tname = red
tid = 3, tname = green

tags_uploads
tuid = 1, tid = 1, uid = 1
tuid = 2, tid = 1, uid = 2
tuid = 3, tid = 1, uid = 3
tuid = 4, tid = 2, uid = 1
tuid = 5, tid = 2, uid = 2
tuid = 6, tid = 3, uid = 3

So we have 3 pictures of apples. All 3 are tagged as apple. 2 are tagged as red and one is tagged as green.

If I wanted to show pictures of apples, but not red apples, I think the query should be this:



```
SELECT uid FROM tags_uploads WHERE tid IN (1) AND tid NOT IN (2)
```

Which should bring back just one result of the green apple, right? Well, it brings back all results. So I believe I am missing some crucial stage of logic here!

Again, thanks in advance for helping out a poor soul.

----------


## azcn2503

Also in the above, if you just run a query to show NOT apple's, you still get green's and red's, which are apples  :Wink:

----------

