So Apache logs for web traffic record everything at the row level, when you visit a web site, it records your IP address, date/time, referrer data, etc. etc. etc. / and it is basically limited like that. When you click on another Link, it records another row of the same information basically with a different time or maybe if you visit a few days later a different date/time and whatever page you clicked on… so now you have 2 rows of data, and maybe more for different days.
A BIG However!
You can extrapolate a lot more information simply by taking the Apache text logs and inserting that into a database!
Why ?
Because then, you can use SQL to aggregate the information and track the user to see what is happening on your site. Most people think that you need Google Analytics and while it is true that they record more Web Browser behavior – because they own Chrome and Google Analytics – this is not True, here is an example of a SQL that can do that for you!
WITH ordered AS (
SELECT ip, path, ts,
LAG(ts) OVER (PARTITION BY ip ORDER BY ts) AS prev_ts
FROM access_log
WHERE site = ‘your_web_site_name.com’
AND path NOT LIKE ‘%.php’
AND path NOT LIKE ‘%.xml’
),
sessions AS (
SELECT ip, path, ts,
SUM(CASE WHEN prev_ts IS NULL OR TIMESTAMPDIFF(MINUTE, prev_ts, ts) > 30
THEN 1 ELSE 0 END) OVER (PARTITION BY ip ORDER BY ts) AS session_id
FROM ordered
),
numbered AS (
SELECT ip, session_id, path,
ROW_NUMBER() OVER (PARTITION BY ip, session_id ORDER BY ts) AS step
FROM sessions
)
SELECT n2.path AS second_page, COUNT(*) AS visits
FROM numbered n1
JOIN numbered n2 ON n1.ip = n2.ip
AND n1.session_id = n2.session_id
AND n2.step = 2
WHERE n1.step = 1 AND n1.path = ‘/’
AND n2.path NOT IN (‘/favicon.ico’, ‘/robots.txt’, ‘other_filters’)
AND n2.path NOT LIKE ‘%xmlrpc%’
AND n2.path NOT LIKE ‘%wp-%’
AND n2.path NOT LIKE ‘%.ico’
AND n2.path NOT LIKE ‘%.txt’
GROUP BY n2.path
ORDER BY visits DESC
LIMIT 10
This will create a dataset of basically what happens after the unique IP enters your site, where did they go next, what they clicked on next…
After, you simple convert this information into a Graphic using maybe Metabase or even Tableau or any other Dash-boarding tool – they are a dime a dozen now, all offering different things.
This one is super important because it tells you if the person just clicks and leaves or what happens. You can look at this and start improving your site and then you simply look at the Funnel Rate, is it improving or getting worse ?
If you need something like that, you can hire me as a Consultant, Contact me!