
Which kinds of streaming/browsing behavioural data are most valuable and accessible? [closed]

The system is privately owned but accessible in public spaces to users without an account. For a defined amount of time, users will have 24/7 access to a locked application that I will build. Just assume that users cannot physically leave for that amount of time and that this is their multimedia device for its duration.

My goal is to collect data about browsing/streaming behaviour while keeping personal information to a minimum, or non-existent if possible. All data will go to a daemon and from there to the central server, after which I will apply algorithms to try to find valuable information. The aim of the collected data is to make educated guesses about age and gender, but most importantly about preferences in entertainment media and other information. This information will ultimately be sold exclusively to parties in the entertainment industry.
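To make the pipeline a bit more concrete, here is a minimal sketch of what a single event sent to the daemon might look like, assuming each usage session gets a random token instead of any account or device identifier. The `UsageEvent` class, its field names, and the `serialize` helper are all hypothetical and only illustrate the idea of keeping personal information out of the payload.

```python
import json
import time
import uuid
from dataclasses import asdict, dataclass, field


@dataclass
class UsageEvent:
    """One behavioural event; deliberately carries no personal identifiers."""
    session_id: str  # random per-session token, not tied to a person (assumption)
    source: str      # hypothetical values: "browser", "youtube", "netflix"
    action: str      # hypothetical values: "play", "pause", "search", "page_view"
    timestamp: float = field(default_factory=time.time)
    detail: dict = field(default_factory=dict)  # free-form, e.g. a content category


def new_session_id() -> str:
    """Return a random 32-character hex token for one usage session."""
    return uuid.uuid4().hex


def serialize(event: UsageEvent) -> str:
    """Encode the event as JSON, ready to hand to the daemon."""
    return json.dumps(asdict(event), sort_keys=True)
```

For example, `serialize(UsageEvent(new_session_id(), "youtube", "play", detail={"category": "music"}))` yields one JSON line that the daemon could forward to the central server. Whether a per-session token is anonymous enough depends on what ends up in `detail`.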

What kind of data could and should I collect from browsers and (possibly built-in) YouTube and Netflix applications? The collected data will be used as input for algorithms. I have some hunches, but I would like to have some discussion about the topic.

This is mainly hypothetical, but I will code parts of it and use data from a contained environment for a graded project.

Basic GUI example of locked application