Who's Reading the RSS?
I was looking at the web server logs today while trying to sort out an issue with a spammer on the Milton village web site and I got to thinking about its RSS feed. One of the things that has been annoying me for a long while is that, unlike the other ways people read Milton News, I can't tell how many people read it via RSS.
I know how many subscribers I've got to the mailing list[1] and how many followers @MiltonNews has[2], but RSS is a bit of a black hole.
And then, while squinting at the logs, I had a thought. If I ran an appropriate grep and awk rune over the logs I could extract all the IP addresses which had accessed the RSS feed, pipe that through the magic of sort, uniq and finally wc -w and, voilà, a count of the unique IP addresses which had accessed the page.
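A rune of the sort described above might look something like the following. This is a minimal sketch rather than the exact commands used: it assumes the server writes Combined Log Format (client IP in the first field), that the log files are uncompressed, and that the feed lives at /rss.xml, which is a made-up path standing in for the real one. The function name is likewise just for illustration.

```shell
# Hypothetical feed path (an assumption); set FEED_PATH to the real feed URL.
FEED_PATH="${FEED_PATH:-/rss.xml}"

# Count the distinct client IPs that requested the feed.
# Arguments: one or more uncompressed access log files.
count_feed_ips() {
    # -h: no file name prefix when several files are given
    # -F: treat the pattern as a fixed string, so "." is not a wildcard
    grep -hF "GET $FEED_PATH " "$@" |
        awk '{ print $1 }' |   # field 1 is the client IP in Combined Log Format
        sort |
        uniq |                 # collapse repeat visits from the same IP
        wc -l                  # one IP per line, so lines == unique IPs
}

# Example (not run here): count_feed_ips access.log access.log.1
```

Since each output line holds a single IP address, wc -l and wc -w give the same answer here.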
This is, however, somewhat flawed as a method of counting unique readers. For a start some people, especially those on mobile devices, are continually changing IP addresses, so they are going to count multiple times. Conversely some people (currently, until it's turned off soon) use Google Reader, so any number of readers can sit behind a single fetching address, and indeed I can see hits from IP addresses belonging to Google in the list. There are similar aggregators doing the same sort of thing. Plus of course the page is being indexed by various search engines, including Google.
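One partial mitigation, which is not something the post itself does, is to drop requests that look like crawlers or aggregators before counting. The sketch below assumes Combined Log Format and a hypothetical /rss.xml feed path, and the pattern list is an illustrative guess that is far from complete. It also only removes noise: readers who sit behind an aggregator simply vanish from the count, and roaming mobile readers are still counted several times.

```shell
# Hypothetical feed path (an assumption); set FEED_PATH to the real feed URL.
FEED_PATH="${FEED_PATH:-/rss.xml}"
# Illustrative, incomplete list of substrings that suggest an automated client.
BOT_PATTERN="${BOT_PATTERN:-bot|crawler|spider|feedfetcher}"

# Count distinct feed-requesting IPs, skipping lines that look automated.
count_feed_ips_excluding_bots() {
    grep -hF "GET $FEED_PATH " "$@" |
        # Drops any line containing one of the substrings, case-insensitively.
        # The match is against the whole line, not only the user-agent field.
        grep -viE "$BOT_PATTERN" |
        awk '{ print $1 }' |
        sort -u |
        wc -l
}
```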
The giveaway that I've got a problem is that at the moment I'm running the rune over access.log and access.log.1, i.e. the two most recent log files, which together comprise 7-14 days of web server traffic depending on when the snapshot is taken. That gives me, at the time of writing, 66 IP addresses.
But if I run it over the entire mass of access.log* files, which go back to last June, the total is 656. So the total I get largely depends on how big my sample window is.
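To see how the total creeps up as the window widens, one could print a running count as each older log file is added. Again this is only a sketch under the same assumptions as before (Combined Log Format, uncompressed files, and a hypothetical /rss.xml feed path), and the function name is invented for the example.

```shell
# Hypothetical feed path (an assumption); set FEED_PATH to the real feed URL.
FEED_PATH="${FEED_PATH:-/rss.xml}"

# For each log file given, print the file name and the number of distinct
# feed-requesting IPs seen in that file plus every file listed before it.
cumulative_feed_ips() {
    seen=$(mktemp) || return 1
    for log in "$@"; do
        # Append this file's feed-requesting IPs to the running list.
        grep -hF "GET $FEED_PATH " "$log" | awk '{ print $1 }' >> "$seen"
        count=$(sort -u "$seen" | wc -l)
        printf '%s\t%s\n' "$log" "$((count))"
    done
    rm -f "$seen"
}

# Example (not run here): cumulative_feed_ips access.log access.log.1
```

If the count keeps climbing with every file added, rather than levelling off, that is a sign the figure is measuring address churn and crawler traffic as much as readership.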
Hmmmm ...
The only good news is that I think I can safely conclude that the RSS feed is being read by some people, but I can't say with any confidence at all how many.
1. 330 as I type.
2. 125. For the latest figures for both, see this page, where the totals are updated hourly. Via a bit of screen scraping I also used to know how many people "liked" the Milton News page on Facebook, before we dropped that.
Tags: linux, websites | Written 27/03/13 |