pub history site audit

As it was late at night, and after most of a bottle of wine, I thought I would take the plunge and find out where the fifty thousand or so pages on my site actually are.

I would like to reduce this number, and I am also unclear as to why so few people visit the site(s)!

First, create a fresh listing of every path on the site; this takes a few seconds.

find "$(pwd)" > mall.txt
wc -l < mall.txt

This gives me 68613, which is the number of lines in the file, one per path found.
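Since find lists directories as well as files, that figure includes both. A minimal sketch of counting regular files only; the function name count_files is made up for illustration, and the directory to scan is passed as an argument:

```shell
# Hypothetical helper: count regular files under a directory, skipping directories.
count_files() {
    # -type f keeps regular files only; tr strips the padding some wc versions add
    find "${1:-.}" -type f | wc -l | tr -d ' '
}

# usage: count_files /path/to/site
```

File names containing newlines would throw the count off, but that seems unlikely on a web site.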

Then pick out only the shtml files:

grep shtml mall.txt > malls.txt
wc -l < malls.txt

This gives me 48060, again a count of lines, so about 48000 paths containing shtml.
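Note that grep shtml matches those letters anywhere in a path, so a directory called shtml or a backup such as page.shtml.bak would be counted too. A sketch that counts only paths ending in .shtml, assuming the full listing is in mall.txt as above (count_shtml is a name I have made up):

```shell
# Hypothetical helper: count lines in a listing that end in ".shtml".
count_shtml() {
    # \. matches a literal dot and $ anchors the match to the end of the line
    grep -c '\.shtml$' "$1"
}

# usage: count_shtml mall.txt
```

If the two counts differ, the difference is the number of paths that mention shtml without actually being shtml pages.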

 

From here on I am searching this second file only (about 48000 entries).

To find specific words in the file, grep for them and count the matches; this is just a test using various counties and partial words.

grep -c Essex malls.txt    # 4110
grep -c Hamp malls.txt     # 3239
grep -c Sussex malls.txt   # 1821
grep -c Surrey malls.txt   # 1835
grep -c Kent malls.txt     # 5558
grep -c London malls.txt   # 12428
grep -c Beer malls.txt     # 451
grep -c Midd malls.txt     # 1576
grep -c Bed malls.txt      # 377
grep -c Buck malls.txt     # 1709
grep -c Camb malls.txt     # 1850
grep -c Corn malls.txt     # 276
grep -c Glouc malls.txt    # 2660
grep -c Oxf malls.txt      # 1528
grep -c Somer malls.txt    # 606
grep -c Berks malls.txt    # 2086
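The same counts can be produced in one go with a small loop, sorted largest first. This is only a sketch: count_terms is a hypothetical name, and it assumes the listing file holds one path per line. Bear in mind that partial words can overcount (Hamp would also match any path containing Northampton or Hampstead, for example), so the figures are a rough guide.

```shell
# Hypothetical helper: count matching lines for each search term in a listing file.
# Usage: count_terms FILE TERM [TERM...]
count_terms() {
    list_file=$1
    shift
    for term in "$@"; do
        # grep -c prints the number of matching lines (0 if there are none)
        printf '%s\t%s\n' "$(grep -c -- "$term" "$list_file")" "$term"
    done | sort -rn   # numeric sort, largest count first
}

# usage: count_terms malls.txt Essex Hamp Sussex Surrey Kent London
```

Each output line is the count, a tab, then the term, which makes it easy to paste into a spreadsheet.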

Surprisingly, London is smaller than I expected: quite an eye opener, with only 12428 entries, roughly a quarter of the 48060 shtml paths!
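To put a figure like this in proportion, the share of the total can be worked out with awk. A small sketch using the two counts from above; percent_of is a made-up name:

```shell
# Hypothetical helper: print n as a percentage of total, to one decimal place.
percent_of() {
    awk -v n="$1" -v t="$2" 'BEGIN { printf "%.1f\n", 100 * n / t }'
}

percent_of 12428 48060   # London entries out of all shtml paths; prints 25.9
```

The same function works for any of the other counties by swapping in its count.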

 
