It is fairly obvious that content creators are the engine of every social-networking site. Every owner of such a site knows that he provides only the machinery, leaving the “stream of life” in the hands (and keyboards) of the most active users. Numbers say it best: my research shows that only 0.5% of all users of my S-N site are responsible for 38% of the content created!
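A figure like this can be computed from per-user post counts. The snippet below is only a minimal sketch: the `top_share` helper and the toy numbers are my assumptions for illustration, not the query actually run on the site.

```python
def top_share(post_counts, top_fraction=0.005):
    """Fraction of all posts written by the most active `top_fraction` of users."""
    counts = sorted(post_counts.values(), reverse=True)
    # Number of users in the top group, never fewer than one.
    n_top = max(1, round(len(counts) * top_fraction))
    total = sum(counts)
    return sum(counts[:n_top]) / total if total else 0.0


# Toy data (hypothetical): 200 users, one heavy poster,
# 62 occasional posters and 137 users who never posted.
post_counts = {"u0": 38}
post_counts.update({f"u{i}": 1 for i in range(1, 63)})
post_counts.update({f"u{i}": 0 for i in range(63, 200)})

share = top_share(post_counts)
print(f"top 0.5% of users wrote {share:.0%} of all posts")
```

Users who never posted have to be included with a count of zero, otherwise the "0.5% of all users" group is taken from active users only and the share comes out too low.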

From the business point of view it is critical to have such users. When everybody wants to eat but nobody plants crops, the result is starvation for most of the society. It is also said that valuable content has a magnetism of its own, attracting both users and search engines.

Hunting for content creators should be high on the to-do list after launching a UCC website. Connecting the dots between content creation in S-N sites and my interest in data mining led to an idea: use data mining to discover users who might be better-than-average content suppliers.

How to do it, given an 8-year-old Internet board database full of profile information, with over 3200 users and over 115k posts? How will it affect the life of the society? How reliable is the research? And finally, what is the point (where is the money)?

As usual, there are a lot of questions, and the answers come only as probabilities. The next piece of the picture is revealed in the following part.

Part 2

As I wrote in the previous part above (content creators part 1), discovering ubercreators and exploiting this knowledge should be an important part of the development of every social-networking site.

My project (idea) is to set up a system that finds content creators on a functioning Internet board using data mining algorithms. Some details:

  • a database (MySQL) with over 3k users and about 70 parameters describing them,
  • the parameters describing users must be selected (manually; technically this comes down to selecting tables in the database, and the process could be automated if necessary),
  • Weka is used as a set of classifiers and clustering algorithms (the data must be prepared for both the program and the algorithm).
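Preparing data for Weka usually means writing it out in Weka's ARFF text format. Here is a rough sketch of that step. The attribute names (`post_count`, `threads_started`, `days_registered`), the relation name and the `to_arff` helper are hypothetical, and the rows are hard-coded here instead of being read from the MySQL database.

```python
def to_arff(relation, attributes, rows):
    """Build ARFF text for numeric attributes; None is written as '?' (missing)."""
    lines = [f"@relation {relation}", ""]
    lines += [f"@attribute {name} numeric" for name in attributes]
    lines += ["", "@data"]
    for row in rows:
        values = ["?" if row.get(name) is None else str(row[name])
                  for name in attributes]
        lines.append(",".join(values))
    return "\n".join(lines) + "\n"


# Hypothetical user rows; in the real project these would come from the
# selected tables of the board database.
users = [
    {"post_count": 412, "threads_started": 37, "days_registered": 2100},
    {"post_count": 3, "threads_started": 0, "days_registered": None},
]
attributes = ["post_count", "threads_started", "days_registered"]

arff_text = to_arff("board_users", attributes, users)
print(arff_text)
```

Only numeric attributes are handled in this sketch. Nominal parameters would need an `@attribute` line listing their allowed values.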

Content creation on a discussion board is not a really complex issue. Although it is difficult to evaluate the value of messages, in most cases it is not even necessary. It is enough to eliminate obvious cases of spamming and just let the snowball roll down the hill.

At a certain moment, discovering users with hidden potential to create valuable content can give an evolving society a serious boost. Given a set of users with their parameters, with an emphasis on the parameters describing activity and “creative spirit”, the algorithm does the rest of the job, clustering users into groups with a high level of similarity. The point is to use the results of the classification to give positive feedback to possible creators and so exploit their potential.
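To make the clustering step more concrete, here is a tiny k-means written in plain Python. It is a stand-in for illustration only: the project itself relies on Weka's algorithms, and the two features (posts per month, threads started), the user names and the numbers are invented.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means on numeric tuples; returns (centroids, labels)."""
    # Deterministic start: pick k points spread across the input order.
    centroids = [points[i * len(points) // k] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centroids, labels


# Hypothetical users: (posts per month, threads started).
names = ["ann", "bob", "cid", "dee", "eve", "fay"]
points = [(2, 0), (3, 1), (1, 0), (40, 6), (45, 8), (38, 5)]

centroids, labels = kmeans(points, k=2)
# Treat the cluster with the highest average posting rate as the creators.
active = max(range(len(centroids)), key=lambda c: centroids[c][0])
candidates = [n for n, label in zip(names, labels) if label == active]
print(candidates)
```

With real data the parameters should be scaled to comparable ranges first, otherwise the one with the largest numbers dominates the distance. The interesting users are those who land in the active cluster without being among the top posters yet.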

The most reliable way to measure results is to implement the model in a real-life system. However, some modelling is also necessary, because walking in the dark without even a prediction (a flashlight) of whether it is going to succeed is unacceptable in any business. Success in this case means quick development of the network society, with visible growth of valuable content and SEO parameters.

Content creators in social-networking sites part 1

The next chapter covers the chosen parameters, the algorithm and the modelling.


November 17th, 2016

Posted In: hunting content, web content, web mining, YouTube
