
Final Chapter – Next Steps

Met with Pádraig to discuss next steps.
Recording

We decided to go along the route of trying to classify CFAs based on synthetic datasets. We realised that there are no synthetic datasets out there that model both contacts and clustering.

We decided to be clever with the Studivz dataset: run the Moses CFA over it, pick a random community, then build out the network from connected communities.

The idea is that we can decide how many nodes we want, and keep going until we have enough.

 

I implemented this. The script takes an edge list and the community allocation produced by Moses (see file in SVN).

The script picks a random community, C0. Then, while the total number of nodes is below the maximum, it picks the community (C1) connected to C0 that has the largest number of links to C0. It then picks one of these communities at random and repeats the picking process until the maximum number of nodes has been reached or exceeded.
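The picking process can be sketched in Python as below. This is only a rough illustration, not the script in SVN: the function names, the input shapes (a list of node pairs plus a dict mapping community id to a set of nodes) and the `select` parameter are all made up for the example. It also assumes that "one of these communities" means one of the communities already selected, and that the run stops early if no unselected neighbouring community is left.

```python
import random
from collections import defaultdict


def community_link_counts(edges, communities):
    """Count the edges running between each pair of distinct communities."""
    membership = defaultdict(set)  # node -> ids of the communities it belongs to
    for cid, members in communities.items():
        for node in members:
            membership[node].add(cid)
    counts = defaultdict(lambda: defaultdict(int))
    for u, v in edges:
        # Each edge counts once per community pair, even when communities overlap.
        pairs = {frozenset((a, b))
                 for a in membership[u] for b in membership[v] if a != b}
        for pair in pairs:
            a, b = tuple(pair)
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def sample_subnetwork(edges, communities, max_nodes, select=max, seed=None):
    """Grow a sample from a random community until max_nodes is reached.

    `select` is a hypothetical knob: max picks the most strongly linked
    neighbouring community, any other rule with the same signature also works.
    """
    rng = random.Random(seed)
    counts = community_link_counts(edges, communities)
    start = rng.choice(sorted(communities))
    chosen = [start]
    nodes = set(communities[start])
    while len(nodes) < max_nodes:
        # Assumption: grow from a random already-selected community
        # that still has at least one unselected neighbour.
        frontier = [c for c in chosen
                    if any(n not in chosen for n in counts[c])]
        if not frontier:
            break  # nothing left that is connected to the sample
        current = rng.choice(frontier)
        candidates = [n for n in counts[current] if n not in chosen]
        best = select(candidates, key=lambda n: counts[current][n])
        chosen.append(best)
        nodes |= set(communities[best])
    # Keep only the edges whose endpoints both made it into the sample.
    kept = [(u, v) for u, v in edges if u in nodes and v in nodes]
    return chosen, nodes, kept
```

Because whole communities are added at a time, the final node count can overshoot `max_nodes`, which matches the "reached or exceeded" stopping rule described above.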

Initially, I did this four times with a limit of 200 nodes, and produced the following graphs, coloured by modularity class and sized by betweenness centrality (sized linearly between 5 and 100 in Gephi).

studivz-three-200-D

studivz-three-200-B

studivz-three-200-C

studivz-three-200-A

But then I got to thinking: what are we actually trying to test here? It’s a bit vague to just see what happens in these datasets. We could be testing a number of things:

  • Which CFA works best for well-connected communities?
  • Which CFA works best for poorly connected communities?
  • Are we testing a particular aspect of a network?
    • Density?
    • Average community size?
    • Average network size?
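To make the aspects in the list above concrete, here is a minimal Python sketch of how they might be measured for one sampled network. The function name `network_summary` and the input shapes (a list of undirected node pairs, and a dict mapping community id to a set of nodes) are assumptions for illustration only. Density is taken as the usual undirected definition, 2E / (N(N-1)).

```python
def network_summary(edges, communities):
    """Summarise one sampled network: size, density, average community size."""
    nodes = {n for edge in edges for n in edge}
    # Treat the graph as undirected and simple: drop self-loops and duplicates.
    unique_edges = {frozenset(e) for e in edges if e[0] != e[1]}
    n = len(nodes)
    density = 2 * len(unique_edges) / (n * (n - 1)) if n > 1 else 0.0
    sizes = [len(members) for members in communities.values()]
    avg_size = sum(sizes) / len(sizes) if sizes else 0.0
    return {
        "nodes": n,
        "edges": len(unique_edges),
        "density": density,
        "avg_community_size": avg_size,
    }
```

Running this over each sample and averaging the "nodes" value would give the average network size for a group of samples.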

So I also decided to generate a set of datasets with poorly connected communities, and adapted the algorithm to pick the communities with the lowest number of links, resulting in the four below:

studivz-three-200-A-MIN

studivz-three-200-B-MIN

studivz-three-200-C-MIN

studivz-three-200-D-MIN

Also, perhaps we should be considering weighted links, e.g. the most time connected vs the least time connected?

Each of these subsets should be compared in terms of other network metrics to see if there is an effect on CFA performance.

Awesome command to do this:

EXPERIMENT_GROUP=FINAL_EVALUATION
DATASETS=studivz-three-200-A,studivz-three-200-B,studivz-three-200-C,studivz-three-200-D,studivz-three-200-A-MIN,studivz-three-200-B-MIN,studivz-three-200-C-MIN,studivz-three-200-D-MIN,mit-nov,mit-oct,cambridge,social-sensing,hypertext2009,infocom-2005,infocom-2006
for DATASET in $(echo ${DATASETS} | tr "," "\n"); do
  php -f scripts/stats/DatasetCommunityLinkCountStats.php \
    OUTPUT/${EXPERIMENT_GROUP}/edgelist-graphs/${DATASET}/edge_list.dat \
    OUTPUT/${EXPERIMENT_GROUP}/communities/${DATASET}/Moses/no-global-parent/edge_list.dat.communities.dat \
    CSV ${DATASET} \
    > OUTPUT/${EXPERIMENT_GROUP}/data/${DATASET}-DatasetCommLinkStats.txt
done

  1. Pádraig
    June 23rd, 2012 at 12:05 | #1

    So hopefully now we have two categories of network, one with densely connected communities and one loosely connected. One way to check this would be to look at the distribution of external contacts per time period per node compared with the internal distribution. I suppose it would be enough to just look at the ratio of the means of these distributions, i.e. average number of internal to external contacts.

    It may be though that your two alternative selection policies (min versus max) are very influenced by the starting point. From looking at the network diagrams it looks like studivz-three-200-B is sampled from a very dense part of the network and it looks denser than the other ‘max’ samplings.

    Anyway, it looks like we have a strategy for producing shed loads of ‘realistic’ contact network data.

    Pádraig.
