sql help for a lazy guy

Areeda · Feb 14, 2010

I know some of you can do this off the top of your head, so while I read some interesting aviation posts would you please help. Don't spend time researching it, I'll do that if it's not obvious. I'd like some standard sql but if it's easier this case is mysql.

In my approach plate database there is a field pdf which has the filename. Several pdfs are not unique such as departure/arrival procedures and takeoff minimums.

I'd like to know how many unique pdfs and how many times the multiple ones are used.

Thanks,

Joe

CJones · Feb 14, 2010

select distinct pdf as PDF, count(*) as countPDF from table

JeffDG · Feb 14, 2010

CJones said:
select distinct pdf as PDF, count(*) as countPDF from table

You may also need a "group by pdf" at the end of that as well. If you want it sorted, include the appropriate "order by" clause as well.

Areeda · Feb 14, 2010

CJones said:
select distinct pdf as PDF, count(*) as countPDF from table

Hmmm, not quite as a matter fact I don't get it.

Code:

[B]mysql> select distinct link as LINK, count(link) as cnt from ttpidx2;
+----------------------------------------------+-------+
| LINK                                         | cnt   |
+----------------------------------------------+-------+
| http://www.naco.faa.gov/d-tpp/1002/SW1TO.PDF | 17558 | 
+----------------------------------------------+-------+
1 row in set (0.01 sec)
[/B]

Here's an example of some records

Code:

mysql> select link from ttpidx2 limit 10; +-------------------------------------------------+ | link | +-------------------------------------------------+ | http://www.naco.faa.gov/d-tpp/1002/SW1TO.PDF | | http://www.naco.faa.gov/d-tpp/1002/SW1ALT.PDF | | http://www.naco.faa.gov/d-tpp/1002/05054R11.PDF | | http://www.naco.faa.gov/d-tpp/1002/05054R29.PDF | | http://www.naco.faa.gov/d-tpp/1002/05054V29.PDF | | http://www.naco.faa.gov/d-tpp/1002/SW1TO.PDF | | http://www.naco.faa.gov/d-tpp/1002/SW1ALT.PDF | | http://www.naco.faa.gov/d-tpp/1002/00018IL2.PDF | | http://www.naco.faa.gov/d-tpp/1002/00018R2.PDF | | http://www.naco.faa.gov/d-tpp/1002/00018R20.PDF | +-------------------------------------------------+ 10 rows in set (0.00 sec)

add a group by, then make it a compound statement and I think I can get what I want.

Thanks Chris you got me on the way. I'll post the final mystic incantation.

Joe

CJones · Feb 14, 2010

CJones said:
select distinct pdf as PDF, count(*) as countPDF from table

JeffDG said:
You may also need a "group by pdf" at the end of that as well. If you want it sorted, include the appropriate "order by" clause as well.

Yep.. You're right..

select distinct pdf as PDF, count(*) as CNT from table group by PDF

That will return all of the unique filenames and the number of times each one of them appears.

Countdown until Jesse shows us how this can all be done with three characters and a paper clip.... 3.... 2....

JeffDG · Feb 14, 2010

CJones said:
Yep.. You're right..

select distinct pdf as PDF, count(*) as CNT from table group by PDF

That will return all of the unique filenames and the number of times each one of them appears.

Countdown until Jesse shows us how this can all be done with three characters and a paper clip.... 3.... 2....

The "distinct" is redundant. Group By/Count(*) implies a distinct and is not needed, although I'm not that up on mysql, most of my stuff is MS SQL Server.

Areeda · Feb 14, 2010

OK I got it now. It's a subquery not a compound statement.

Code:

mysql> select count(*),sum(cnt) from 
> (select distinct link as LINK, count(link) as cnt from ttpidx2 group by link ) 
> as t2 where cnt > 1;
+----------+----------+
| count(*) | sum(cnt) |
+----------+----------+
|      339 |     4877 | 
+----------+----------+
1 row in set (0.17 sec)

So in case you were wondering out of 17,558 documents listed in the tpp files at naco.faa.gov there are 339 that are duplicated representing 4,877 of them.

Thanks everybody.
I'll try dropping the distinct.

Joe

Areeda · Feb 14, 2010

Yes distinct is redundant in mysql also. Final query is:

select count(*),sum(cnt) from (select link as LINK, count(link) as cnt from ttpidx2 group by link ) as t2 where cnt > 1;

Joe

gprellwitz · Feb 14, 2010

Areeda said:
Yes distinct is redundant in mysql also. Final query is:

select count(*),sum(cnt) from (select link as LINK, count(link) as cnt from ttpidx2 group by link ) as t2 where cnt > 1;

Joe

Not that it's really needed at this point, but

Code:

select link as LINK, 
count(link) as cnt 
from ttpidx2 
group by link
having count(link) > 1

would be simpler. (The count(link) in the last line may need to be "cnt")

Edit: Oops, that's the answer to a question you haven't asked, which ones are used more than once!

Areeda · Feb 14, 2010

Cool, I am learning sql as needed. You probably can tell.

Joe

murphey · Feb 14, 2010

JeffDG said:
The "distinct" is redundant. Group By/Count(*) implies a distinct and is not needed, although I'm not that up on mysql, most of my stuff is MS SQL Server.

Nah -- keep it ANSI SQL and there's no problem with the flavor.

murphey · Feb 14, 2010

Areeda said:
OK I got it now. It's a subquery not a compound statement.

Code:

mysql> select count(*),sum(cnt) from > (select distinct link as LINK, count(link) as cnt from ttpidx2 group by link ) > as t2 where cnt > 1; +----------+----------+ | count(*) | sum(cnt) | +----------+----------+ | 339 | 4877 | +----------+----------+ 1 row in set (0.17 sec)

So in case you were wondering out of 17,558 documents listed in the tpp files at naco.faa.gov there are 339 that are duplicated representing 4,877 of them.

Thanks everybody.
I'll try dropping the distinct.

Joe

Makes sense - you want to grab all the records that are duplicates, THEN count them.

Ah, the days of teaching those young things all the various incantations...of course they all freaked at normal forms...

sql help for a lazy guy

Areeda

Pattern Altitude

CJones

Final Approach

JeffDG

Touchdown! Greaser!

Areeda

Pattern Altitude

CJones

Final Approach

JeffDG

Touchdown! Greaser!

Areeda

Pattern Altitude

Areeda

Pattern Altitude

gprellwitz

Touchdown! Greaser!

Areeda

Pattern Altitude

murphey

Touchdown! Greaser!

murphey

Touchdown! Greaser!