Data Dojo Würzburg 10
March 2022
- When: Thursday, March 10th, 2022 at 18:00
- Where: Zoom
- Info: DataDojo Website, Repo
Please add your name to the list (click the pen icon at the top left to edit) if you plan to come, and remove it again if you cannot make it. Feel free to add your preferred tool or programming language.
- Markus (R or julia)
Local results of the German Federal Election from Würzburg (Stadt and Landkreis) together with demographic information (e.g. age structure): Stadt, Landkreis
This time the data will be provided as a pre-processed single tidy table.
Specific tasks for today
1. Plot the distribution of mean age across locations
2. Is there a correlation between mean age and votes for any party?
3. Independent of point 2 😜: if you weight each vote by age, which party has the lowest/highest mean age?
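
As a starting point, the three tasks above could be sketched as follows. This is a minimal sketch in plain Python, not the Dojo's reference solution: the column names (`location`, `mean_age`, `party`, `votes`) and all numbers are made up for illustration, and the real tidy table may be laid out differently. Task 3 is interpreted here as weighting each location's mean age by the votes a party received there, which is an assumption about what "weight each vote by age" means.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical tidy rows: one row per (location, party).
# Column names and all numbers are invented for illustration.
rows = [
    {"location": "A", "mean_age": 40.0, "party": "X", "votes": 60},
    {"location": "A", "mean_age": 40.0, "party": "Y", "votes": 40},
    {"location": "B", "mean_age": 45.0, "party": "X", "votes": 50},
    {"location": "B", "mean_age": 45.0, "party": "Y", "votes": 50},
    {"location": "C", "mean_age": 50.0, "party": "X", "votes": 30},
    {"location": "C", "mean_age": 50.0, "party": "Y", "votes": 70},
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Task 1: distribution of mean age across locations, counted in 5-year bins
# (these counts are what a histogram would display).
mean_age = {r["location"]: r["mean_age"] for r in rows}
age_bins = Counter(int(age // 5) * 5 for age in mean_age.values())

# Task 2: correlation between mean age and a party's vote share per location.
totals = defaultdict(int)
for r in rows:
    totals[r["location"]] += r["votes"]


def age_share_correlation(party):
    picked = [r for r in rows if r["party"] == party]
    ages = [r["mean_age"] for r in picked]
    shares = [r["votes"] / totals[r["location"]] for r in picked]
    return pearson(ages, shares)


# Task 3: mean age per party, weighting each location's mean age by the
# number of votes the party received there.
def weighted_mean_age(party):
    picked = [r for r in rows if r["party"] == party]
    total_votes = sum(r["votes"] for r in picked)
    return sum(r["votes"] * r["mean_age"] for r in picked) / total_votes


for party in sorted({r["party"] for r in rows}):
    print(party, round(age_share_correlation(party), 3),
          round(weighted_mean_age(party), 2))
print(dict(sorted(age_bins.items())))
```

Vote shares rather than raw vote counts are used for the correlation so that large locations do not dominate simply because more people live there.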
Question Pool:
- Generic
  - What kind of information is stored in the table(s)?
  - How much data is missing?
  - Is the dataset clean or are there any clear outliers?
  - How can the different datasets be combined?
  - How to visualize the results in a suitable way?
- Specific
  - Overview of voting behavior: how does voting behavior vary by location? (General trends, total variability, …)
  - Overview of demographic info: how does age/gender distribution vary by location? (General trends, total variability, …)
  - Which party has the strongest (positive/negative) correlation with age?
  - Which party has the strongest (positive/negative) correlation with gender?
  - Can we predict voting behavior from age/gender distribution? (or vice-versa)
- Further Ideas
  - Show results with district resolution on an interactive map (e.g. using these shapes)
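
For the generic questions about missing data and outliers, a first pass might look like the sketch below. It is only one possible approach: the rows, column names, and numbers are hypothetical, `None` is assumed to mark a missing value, and the outlier rule (distance from the median measured in scaled median absolute deviations, with a cutoff of 3) is a common heuristic chosen here for illustration, not something prescribed by the dataset.

```python
from statistics import median

# Hypothetical per-location rows; None marks a missing value.
# Column names and numbers are invented for illustration.
rows = [
    {"location": "A", "mean_age": 41.0, "votes": 1200},
    {"location": "B", "mean_age": 43.5, "votes": 950},
    {"location": "C", "mean_age": None, "votes": 1100},
    {"location": "D", "mean_age": 44.0, "votes": None},
    {"location": "E", "mean_age": 42.5, "votes": 1300},
    {"location": "F", "mean_age": 71.0, "votes": 1000},
]


def missing_share(rows, column):
    """Fraction of rows in which `column` is missing (None)."""
    return sum(r[column] is None for r in rows) / len(rows)


def mad_outliers(rows, column, threshold=3.0):
    """Locations whose value lies more than `threshold` scaled MADs from the median."""
    present = [r for r in rows if r[column] is not None]
    values = [r[column] for r in present]
    center = median(values)
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        return []  # no spread around the median, so nothing can be flagged
    scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation
    return [r["location"] for r in present
            if abs(r[column] - center) > threshold * scale]


for column in ("mean_age", "votes"):
    print(column, round(missing_share(rows, column), 2),
          mad_outliers(rows, column))
```

A median-based rule is used because a single extreme value inflates the mean and standard deviation, which can hide the very outlier being looked for in a small table.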
Collaborative Tools and Workflow
For notebooks (R, Python, Julia, JS, …) with real-time collaboration, CoCalc seems to be the best option right now. It worked great the last couple of times, so we’ll stick with it for now. You need to register an account there (it is free).
Future Suggestions
