
March 21, 2008

Object models and data sets for a social network .. a technique for data

Following on from my last post: Object models and data sets for a social network .. crawlers vs simulation ..

(Read the previous post first to get the context.)

I was thinking about this last night, and this may be a solution.

The question is: can we extrapolate social network data based on existing data patterns?

We have three components:
a) A knowledge base (typical number of friends, number of blog posts, etc.)
b) Parameterization (the setup configuration used to run the generator) and
c) The generation itself

We need a large volume of data for it to be relevant.
We need parameters that mirror real life.

Here is my plan.
The objective is to 'clone' the transactions from a core set first and then apply the parameters (the intelligence).

a) Create configuration tables (for instance a profiles table)

b) Create transaction tables (blog entries, Facebook pokes, etc.) and
populate them with the base entries

c) Create a cartesian join between the profiles table and the blog
entries table. Normally, cartesian joins are not desirable. However,
they are a good way to create a very large number of records very
quickly.
For instance:

select profile.profile_id, blogs.blog_id from profiles, blogs

If profiles has profile_id P1, P2
and blogs has blog_id B1, B2,

the query will give 4 rows:
P1, B1
P1, B2
P2, B1
P2, B2

d) We thus get a large number of rows

e) We then 'apply' the rules as a series of update statements on the
base data (the output of the cartesian join)

f) This gives us the 'real' data

g) To make this work, I plan to 'open source' the whole thing -
the tables and, more importantly, the knowledge base.

h) So, I see many people contributing insights (a typical user on FB
has 100 friends on average, a typical Myspace user has 40 blog entries
per week, that sort of thing)

i) So, we can now create a set of data based on parameters, and it is
all open sourced
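
To make steps a) to f) concrete, here is a rough sketch using Python's built-in sqlite3 module. It is only an illustration of the idea, not a finished generator. The table name generated, the active column, the function name generate_dataset and the max_posts parameter are all made up for this example, and the single rule shown (keep at most max_posts blog entries per profile) is a stand-in for whatever the knowledge base would really say.

```python
import sqlite3


def generate_dataset(profile_ids, blog_ids, max_posts):
    """Clone base rows with a cartesian join, then apply one rule as an UPDATE.

    All names here (generated, active, max_posts) are hypothetical.
    Returns (all_rows, active_rows) as lists of (profile_id, blog_id).
    """
    conn = sqlite3.connect(":memory:")

    # Steps a) and b): configuration table, transaction table, output table.
    conn.executescript(
        """
        CREATE TABLE profiles (profile_id TEXT PRIMARY KEY);
        CREATE TABLE blogs (blog_id TEXT PRIMARY KEY);
        CREATE TABLE generated (
            profile_id TEXT,
            blog_id TEXT,
            active INTEGER DEFAULT 1
        );
        """
    )
    conn.executemany("INSERT INTO profiles VALUES (?)", [(p,) for p in profile_ids])
    conn.executemany("INSERT INTO blogs VALUES (?)", [(b,) for b in blog_ids])

    # Steps c) and d): the cartesian join multiplies the base rows.
    conn.execute(
        """
        INSERT INTO generated (profile_id, blog_id)
        SELECT profiles.profile_id, blogs.blog_id
        FROM profiles CROSS JOIN blogs
        """
    )

    # Step e): one rule applied as an update statement. A row is switched
    # off when the same profile already has max_posts entries with a
    # smaller blog_id.
    conn.execute(
        """
        UPDATE generated SET active = 0
        WHERE (SELECT COUNT(*) FROM generated AS g2
               WHERE g2.profile_id = generated.profile_id
                 AND g2.blog_id < generated.blog_id) >= ?
        """,
        (max_posts,),
    )

    # Step f): read back the full join and the rows that survived the rule.
    all_rows = conn.execute(
        "SELECT profile_id, blog_id FROM generated ORDER BY profile_id, blog_id"
    ).fetchall()
    active_rows = conn.execute(
        "SELECT profile_id, blog_id FROM generated WHERE active = 1 "
        "ORDER BY profile_id, blog_id"
    ).fetchall()
    conn.close()
    return all_rows, active_rows


if __name__ == "__main__":
    rows, active = generate_dataset(["P1", "P2"], ["B1", "B2"], max_posts=1)
    print(len(rows), "rows after the join,", len(active), "after the rule")
```

In a real run the rules would come from the shared knowledge base, one update statement per rule, and the profiles and blogs tables would hold the core set of base entries instead of two toy rows each.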

Thoughts?
kind rgds
Ajit

Posted by ajit at March 21, 2008 11:36 AM


