Build a simple model of the 2016 presidential election #34

Open
chrisdick14 opened this issue Feb 13, 2017 · 13 comments
Comments

@chrisdick14
Contributor

We have put together a lot of data about the 2016 election. Let's build some simple models that try to explain the outcomes so that we can discuss them further.

Either of these notebooks would be a good start (if you are working in R):

https://github.com/Data4Democracy/election-transparency/blob/master/notebooks/r-notebooks/model_2016_presresults.Rmd

or

https://github.com/Data4Democracy/election-transparency/blob/master/notebooks/r-notebooks/basic-viz.Rmd

@jenniferthompson
Contributor

Happy to keep working on this.

@chrisdick14
Contributor Author

Great @jenniferthompson! This is a big one, so others are definitely welcome. We want to get several different ideas down and start comparing them.

@pghosh

pghosh commented Feb 15, 2017

I will jump in too. I will be using Python, though.

@chrisdick14
Contributor Author

Great @pghosh! Glad to see some Python in the group!

@chrispelkey
Contributor

I'd like to jump in and do some experimenting too. Are we looking at all types of models, sticking with simpler models like logistic regression, or just seeing what works best?

Are there basic goodness-of-fit statistics that we are going to use for comparison? AIC, BIC, adjusted R-squared?

@chrisdick14
Contributor Author

@chrispelkey, I went with basic models just to get people started. I would say we can start looking at all types of models. Also, take a look here near the bottom of the document to get some further ideas of what we were thinking about modeling and trying to answer. At the end of the day, we are trying to get a good model fit, but some of the most interesting information (at least for me) is going to be which variables strongly predict a Trump or Clinton win in a county, and where we misclassify.

As for model comparison, I am of the mind that you usually need more than one measure to explain the pros and cons of each method's fit. But that is just my view of model selection, so it is always up for discussion.
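To make the "more than one measure" idea concrete, here is a minimal Python sketch that scores two candidate models on AIC, BIC, and misclassification rate at the same time. The function names, the outcome vector, and the predicted probabilities are all made up for illustration; none of it comes from the project's data or notebooks.

```python
import math

def log_likelihood(y_true, p_pred):
    """Bernoulli log-likelihood of 0/1 outcomes given predicted probabilities."""
    eps = 1e-12  # guard against log(0)
    return sum(
        y * math.log(max(p, eps)) + (1 - y) * math.log(max(1 - p, eps))
        for y, p in zip(y_true, p_pred)
    )

def aic(ll, k):
    """Akaike information criterion: 2k - 2*logL (lower is better)."""
    return 2 * k - 2 * ll

def bic(ll, k, n):
    """Bayesian information criterion: k*ln(n) - 2*logL (lower is better)."""
    return k * math.log(n) - 2 * ll

def misclassification_rate(y_true, p_pred, threshold=0.5):
    """Share of observations whose thresholded prediction disagrees with the outcome."""
    wrong = sum(1 for y, p in zip(y_true, p_pred) if (p >= threshold) != (y == 1))
    return wrong / len(y_true)

# Hypothetical toy values: six "counties", two candidate models.
y = [1, 0, 1, 1, 0, 0]
p_model_a = [0.9, 0.2, 0.7, 0.6, 0.4, 0.1]  # assumed to have 2 parameters
p_model_b = [0.8, 0.3, 0.4, 0.7, 0.6, 0.2]  # assumed to have 3 parameters

for name, p, k in [("A", p_model_a, 2), ("B", p_model_b, 3)]:
    ll = log_likelihood(y, p)
    print(name, round(aic(ll, k), 2), round(bic(ll, k, len(y)), 2),
          round(misclassification_rate(y, p), 2))
```

AIC and BIC penalize extra parameters differently, and neither says anything about which counties are misclassified, so looking at them side by side with an error rate gives a fuller picture than any single number.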

@jenniferthompson
Contributor

+1 to @chrisdick14. My plan was to start with the predictors listed here, do a redundancy analysis, and then fit a logistic model for whether each county "picked the winner."
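For anyone following along in Python, the logistic-model step could be sketched roughly as below. This is a plain-Python stand-in fitted by gradient ascent, not the R workflow planned above, and the single standardized predictor and the 0/1 "picked the winner" outcomes are invented toy values.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, xi):
    """Predicted probability for one observation; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))

def fit_logistic(X, y, lr=0.1, n_iter=5000):
    """Fit logistic regression by batch gradient ascent on the mean log-likelihood.

    X is a list of feature lists, y a list of 0/1 outcomes.
    Returns [intercept, coef_1, ..., coef_d].
    """
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(n_iter):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = yi - predict_proba(w, xi)  # gradient of the log-likelihood
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w

# Hypothetical toy data: one standardized county-level predictor and
# whether the county "picked the winner" (1) or not (0).
X = [[-2.0], [-1.0], [-0.5], [0.5], [1.0], [2.0], [-0.2], [0.3]]
y = [0, 0, 1, 0, 1, 1, 0, 1]

w = fit_logistic(X, y)
print("intercept and coefficient:", [round(v, 3) for v in w])
print("P(picked winner | x = 1.5):", round(predict_proba(w, [1.5]), 3))
```

In practice a library fit (for example `glm` in R or a Python statistics package) would be the better choice, since it also reports standard errors and handles convergence; the loop here only shows what the model is estimating.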

@fhollenbach

I'd be happy to work on this for a bit. I have been somewhat out of the loop for the past couple of weeks, but I will try to get some work done on this over the coming weekend.

@chrisdick14
Contributor Author

Great! Thanks @fhollenbach! Impossible to have too many of us working on this.

@chrispelkey
Contributor

I took a first stab at building three simple logistic models. I also don't really know how to work with GitHub (this is my first project), so I'm not sure I uploaded them the right way.

@chrisdick14
Contributor Author

@chrispelkey, did you submit a Pull Request for this?

@chrispelkey
Contributor

@chrisdick14 I think I may have just done it right, finally...

@chrisdick14
Contributor Author

Yep, I see a PR now. One of us will look at this today/tonight!

5 participants