Report

Improving election prediction internationally


Science  03 Feb 2017:
Vol. 355, Issue 6324, pp. 515-520
DOI: 10.1126/science.aal2887

eLetters is an online forum for ongoing peer review. Submission of eLetters is open to all. eLetters are not edited, proofread, or indexed.

  • RE: Beyond Data: A Hierarchical Description of Prediction from the Viewpoint of an Information Quality System
    • Cedric Fan, Professor, MIT Information Quality Program- Data Quality & Info Security Lab, Nanjing Tech University

    In the REPORT “Improving election prediction internationally” (Science 3 February 2017: Vol. 355 no. 6324 pp. 515-520), Ryan Kennedy et al. (1) used global data to improve election prediction internationally. Their results suggested that global elections can be successfully modeled and that they are likely to become more predictable as more information becomes available in future elections.

    Their finding is interesting and useful, and it can be applied to many similar prediction problems in social systems. However, a deeper underlying analysis is still possible. A hierarchical model such as the DIKW pyramid (Fig. 1) can be used to describe the prediction system. To make a prediction, we must know something at a higher level, obtained either by a direct assumption or by bottom-to-top deduction. For example, for information prediction (2) (3) (4), we need to know something at the knowledge level; for knowledge prediction (5), we need to know something at the wisdom level; and for wisdom prediction (6), we need to know something at a correspondingly higher level.

    Fig. 1. Hierarchical description of prediction

    To judge the effect of a prediction, a structure like an information quality system can be adopted (Fig. 1). DIKW quality can be defined hierarchically as the useful proportion of the quantity at the current level, i.e., the proportion that contributes to the quantity at the next higher level. Data quality, for instance, is the useful proportion of the data quantity that contributes to the information quantity...

    Competing Interests: None declared.
  • Ensemble methods significantly improve prediction

    Ensemble methods (1) combine multiple learning algorithms and often obtain better predictive performance than any of the constituent algorithms alone, including single decision trees. The algorithm proposed by Kennedy et al. (2) could be improved by ensemble methods. A variety of ensemble methods exist, including AdaBoost, random forests, extremely randomized trees (extra-trees), gradient boosting, bagging, and voting classifiers. In the voting-classifier approach, decision trees and other algorithms can be combined to improve prediction quality.
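
    The intuition behind majority voting can be illustrated with a small simulation (a sketch with assumed numbers, not taken from the Report): three base classifiers, each correct 70% of the time with independent errors, are combined by majority vote.

```python
import random

random.seed(42)
N = 20000
P = 0.7  # assumed per-classifier accuracy (illustrative, not from the Report)

single_correct = 0
vote_correct = 0
for _ in range(N):
    # Each of three base classifiers is right with probability P,
    # independently of the others.
    votes = [random.random() < P for _ in range(3)]
    single_correct += votes[0]
    vote_correct += sum(votes) >= 2  # majority vote of the three

acc_single = single_correct / N
acc_vote = vote_correct / N
# Theory: majority-vote accuracy = p**3 + 3 * p**2 * (1 - p) = 0.784 when p = 0.7
```

    In practice the base learners' errors are correlated, so the real gain is smaller than this independent-error idealization suggests.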

    References:
    1. https://en.wikipedia.org/wiki/Ensemble_learning
    2. Ryan Kennedy, et al., "Improving election prediction internationally", Science 03 Feb 2017: Vol. 355, Issue 6324, pp. 515-520

    Competing Interests: None declared.
  • Ensemble methods can improve election prediction

    According to Wikipedia (1), in statistics and machine learning, ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone. The single algorithm proposed by Kennedy et al. (2) could be improved by ensemble methods. A variety of ensemble methods exist, including AdaBoost, random forests, extremely randomized trees (extra-trees), gradient boosting, bagging, and voting classifiers. With open-source libraries such as scikit-learn, ensemble machine learning is easy to apply (3). In a red-wine "artificial sommelier" project, a voting classifier improved the red-wine quality prediction by 5 points (4).
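
    As a concrete sketch of the voting approach (the synthetic dataset and the choice of base learners are illustrative assumptions, not the Report's data or models), scikit-learn's VotingClassifier can combine a decision tree with two other learners:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary-classification data standing in for real election features.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

ensemble = VotingClassifier(
    estimators=[
        ("tree", DecisionTreeClassifier(random_state=0)),
        ("logit", LogisticRegression(max_iter=1000)),
        ("knn", KNeighborsClassifier()),
    ],
    voting="hard",  # majority vote over the three class predictions
)
ensemble.fit(X_tr, y_tr)
acc = ensemble.score(X_te, y_te)
```

    With voting="soft", the classifier averages predicted class probabilities instead, which often helps when the base learners are well calibrated.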

    References:
    1. https://en.wikipedia.org/wiki/Ensemble_learning
    2. Ryan Kennedy, et al., "Improving election prediction internationally", Science 03 Feb 2017: Vol. 355, Issue 6324, pp. 515-520
    3. https://en.wikipedia.org/wiki/Scikit-learn
    4. Y. Takefuji, "Ensemble machine learning", Kindaikagaku 2016.

    Competing Interests: None declared.
  • RE: Improving election prediction internationally
    • Paul H Lee, Statistician, Hong Kong Polytechnic University

    Decision trees are machine learning models designed for classification and regression problems, and they can yield accurate predictions (1). In a Report published by Kennedy et al. (2), the authors demonstrated the application of Bayesian additive regression trees (BART, 3), an extension of the classification and regression tree (CART, 1), to predict executive-office elections in more than 85 countries with 80% to 90% accuracy. However, we believe that the accuracy of their predictions can be improved by refining the tree models they used.

    Decision trees are well known for handling nonlinear and interaction effects between the features and the outcome, but they cannot model linear effects directly (4). To capture linear effects between the features and the outcome, a linear model can be incorporated into the terminal nodes: the linear effects are then handled by the linear models, while the nonlinear and interaction effects are handled by the tree structure. Examples of such extensions of decision tree models include the logistic regression tree (5), the Poisson regression tree (6), and the rank-ordered logit tree (4). All of these have been shown to have better predictive power than plain decision trees. We suggest the authors test whether linear associations exist between the features and the outcome and, if so, evaluate the performance of a logistic regression tree.
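
    A minimal sketch of the idea on synthetic data (the dataset, tree depth, and per-leaf model are illustrative assumptions; this is not the cited logistic regression tree algorithm itself): a shallow decision tree routes samples to terminal nodes, and a separate logistic regression is fitted inside each node to capture linear effects.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 3))
# Outcome mixes an interaction effect (tree-friendly) with a strong
# linear effect in the third feature (hard for a shallow tree).
y = (X[:, 0] * X[:, 1] + 2 * X[:, 2] > 0).astype(int)
X_tr, X_te, y_tr, y_te = X[:400], X[400:], y[:400], y[400:]

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X_tr, y_tr)

# Fit one logistic regression per terminal node, on that node's samples.
leaf_models = {}
train_leaves = tree.apply(X_tr)  # leaf index of each training sample
for leaf in np.unique(train_leaves):
    mask = train_leaves == leaf
    if len(np.unique(y_tr[mask])) > 1:  # need both classes present to fit
        leaf_models[leaf] = LogisticRegression(max_iter=1000).fit(X_tr[mask], y_tr[mask])

def predict(X_new):
    leaves = tree.apply(X_new)
    out = tree.predict(X_new)  # fallback: plain tree prediction for pure leaves
    for leaf, model in leaf_models.items():
        sel = leaves == leaf
        if sel.any():
            out[sel] = model.predict(X_new[sel])
    return out

acc_tree = (tree.predict(X_te) == y_te).mean()
acc_model_tree = (predict(X_te) == y_te).mean()
```

    On data like this, the per-leaf linear models can pick up the linear term that the depth-limited tree alone must approximate with axis-aligned splits.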

    References
    1. L. Breiman, et al., Clas...

    Competing Interests: None declared.