This Wednesday, at the Research Team Share-out EdLab Seminar, I will be
presenting some of the work the research team and I have done to make the best use of the Quuppa location tracking system.
I will focus on a paper we recently wrote titled "Cluster Analysis of Real Time Location Data - An Application of Gaussian Mixture Models". Breaking down the title: we are trying to identify groups, or clusters, within a snapshot of 2-dimensional (x, y) Quuppa location data using a statistical technique called Gaussian Mixture Models. In my presentation I will also highlight the motivation for this analysis and the future directions we hope to pursue.
The data and results of our analysis can look like the picture below.
The points represent individuals wearing Quuppa tags, and the blobby shapes are the groups our analysis identified. This analysis gives us insight into each group's location, shape, and physical size. These physical parameters are useful for visualizing the results.
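As a rough illustration (not the code from the paper), here is a minimal sketch of fitting a Gaussian Mixture Model to one snapshot of (x, y) positions. It assumes scikit-learn's `GaussianMixture` and uses synthetic data; the three group centers, the 0.5 m spread, and the variable names are all made up for the example. Each fitted component has a mean (the group's location) and a 2x2 covariance matrix (its shape and size).

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic stand-in for one snapshot of (x, y) tag positions, in meters.
# The three group centers below are invented for illustration.
rng = np.random.default_rng(0)
centers = np.array([[2.0, 2.0], [8.0, 3.0], [5.0, 9.0]])
points = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centers])

# Fit one 2-D Gaussian per expected group.
gmm = GaussianMixture(n_components=3, covariance_type="full", random_state=0)
gmm.fit(points)

for k in range(gmm.n_components):
    mean = gmm.means_[k]        # group location
    cov = gmm.covariances_[k]   # 2x2 matrix describing shape and spread
    # Square roots of the eigenvalues are the standard deviations along
    # the two principal axes of the group's ellipse.
    axes = np.sqrt(np.linalg.eigvalsh(cov))
    print(f"group {k}: center=({mean[0]:.1f}, {mean[1]:.1f}), "
          f"axis std devs=({axes[0]:.2f}, {axes[1]:.2f}) m")
```

The "blobby shapes" in a plot like the one described can be drawn as ellipses from these means and covariances.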
While in the example above the groups are neatly divided, this is not always the case. Take a look at the example below without the blobby shapes and see if you can identify the size, shape and location of 4 different groups.
Is this similar to what you got?
As you can see, it is not always obvious where a group might exist or who belongs to which group. Doing this work by hand, marking out the groups for millions of Quuppa data entries, would also be extremely tedious. Further, our program and analysis allow us to quickly determine group membership (Who is where?) and group membership distribution (How many people are in each group?).
These parameters are very useful for analyzing social dynamics and activity effectiveness during events such as gallery walks, group activities, or other events where participants are spread across different stations or groups. In one application we analyzed data from the D&R Quuppa presentation in January, where people moved through different stations featuring virtual reality programs. One interesting result we quickly found was that the activity most people indicated as their favorite was also the one where people spent the most combined time.
This type of information can be very useful for creating educational content, organizing events, and understanding their effectiveness. This provides the motivation for our research.
Seeing the physical parameters of the groups, such as location, size, and shape, is interesting, but ultimately we would like to focus only on the group distributions and membership. Narrowing our focus would allow us to speed up the calculations we make. Group distributions, along with group membership over time, will help us answer important questions such as "Which group was most popular when?" and "How long did people spend in each group?"
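To make those two questions concrete, here is a small sketch of how they could be answered once every snapshot has been clustered. The input format is hypothetical: one dictionary per snapshot mapping a tag ID to a group label, with snapshots assumed to be evenly spaced (`SNAPSHOT_SECONDS` is an invented interval) and group labels assumed to mean the same thing from one snapshot to the next.

```python
from collections import Counter

# Hypothetical input: one dict per snapshot mapping tag ID -> group label.
# Labels are assumed to be consistent across snapshots.
SNAPSHOT_SECONDS = 10  # assumed time between snapshots
snapshots = [
    {"tag1": "A", "tag2": "A", "tag3": "B"},
    {"tag1": "A", "tag2": "B", "tag3": "B"},
    {"tag1": "C", "tag2": "B", "tag3": "B"},
]

# Group distribution per snapshot: how many people are in each group?
distributions = [Counter(snap.values()) for snap in snapshots]

# "Which group was most popular when?" (one answer per snapshot)
most_popular = [dist.most_common(1)[0][0] for dist in distributions]

# "How long did people spend in each group?" (combined person-seconds)
combined_time = Counter()
for dist in distributions:
    for group, count in dist.items():
        combined_time[group] += count * SNAPSHOT_SECONDS

print(most_popular)         # ['A', 'B', 'B']
print(dict(combined_time))  # {'A': 30, 'B': 50, 'C': 10}
```

Nothing here needs the groups' physical parameters, which is why focusing on membership alone could be cheaper to compute.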
Currently, we are also constrained by having to specify in advance how many groups we expect there to be. We can estimate this by looking at how many stations, activities, or tables there are during an event. To remove this constraint, we are also developing an expanded method that does not require the number of groups up front.
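The expanded method is not described here, so the following is not it. It is only a sketch of one common, generic way to estimate the number of groups from the data: fit a mixture for several candidate counts and compare their Bayesian Information Criterion (BIC), where lower is better. It assumes scikit-learn's `GaussianMixture` and its `bic` method, and the four synthetic groups are made up for the example.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic snapshot with four invented groups of (x, y) positions.
rng = np.random.default_rng(1)
centers = np.array([[1.0, 1.0], [1.0, 9.0], [9.0, 1.0], [9.0, 9.0]])
points = np.vstack([c + rng.normal(scale=0.4, size=(40, 2)) for c in centers])

# Fit a mixture for each candidate group count and record its BIC.
candidates = range(1, 8)
bic_scores = {}
for k in candidates:
    gmm = GaussianMixture(n_components=k, random_state=0).fit(points)
    bic_scores[k] = gmm.bic(points)

# The candidate with the lowest BIC is the estimated number of groups.
best_k = min(bic_scores, key=bic_scores.get)
print("estimated number of groups:", best_k)
```

On cleanly separated data like this, the lowest BIC should land at or near the true count; on real, messier snapshots it is only a heuristic.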
In my opinion, this research will be very useful to educators, curators, and event organizers who want to learn about the effectiveness of their work. I also believe this research, with its goal of creating knowledge and information, is distinct from similar analyses of location data that focus on surveillance.
We are excited to say that our paper was accepted to the Educational Data Mining Conference in Wuhan, China.
I hope to present more on this topic in an engaging way and also answer any questions you may have at this Wednesday's seminar!