Top 8 R Machine Learning Projects

Project mention: Using dictionaries to check text in R, language processing | reddit.com/r/rstats | 2021-08-18
TextFeatures can be used to create a summary table of occurrences of text features like number of unique words, etc. Don't know if it does nouns or verbs.

Project mention: [D] Selecting Hyperparameters Using Bayesian Optimization | reddit.com/r/statistics | 2021-03-03
Disclaimer: I am the maintainer of ParBayesianOptimization. That readme has a pretty good walkthrough of how Bayesian optimization works.



I developed miceRanger because the mice package uses a really slow implementation of random forests. It has a bunch of plotting capabilities and can impute new datasets without retraining the models used in the mice procedure.

lmtp
:package: Nonparametric Causal Effects of Feasible Interventions Based on Modified Treatment Policies :crystal_ball:
Project mention: [Q] Should G-methods, IPTW always be used over traditional regression? | reddit.com/r/statistics | 2021-09-12
The tlverse/sl3 super learner library is much better integrated and a lot more powerful (a bit more complicated in the beginning, but once you understand it, it's great). LMTP has a separate branch that uses sl3: https://github.com/ntwilliams/lmtp/tree/sl3devel. To specify formulas in sl3, you just do Lrnr_glmnet$new(formula = ~ 1 + W + A + A*W), but make sure to download the "dev" version: devtools::install_github("tlverse/sl3", ref = "devel").
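
As a rough illustration of what the formula quoted in that comment expands to, the base R sketch below builds the design matrix for `~ 1 + W + A + A*W` on a small made-up data frame. The values of `W` and `A` and the name `toy` are invented for illustration. Only base R is run; the sl3 calls are repeated from the comment as trailing comments and have not been checked here.

```r
# Toy data: W is a covariate, A a binary treatment (values are made up)
toy <- data.frame(W = c(0.5, 1.2, -0.3, 2.0), A = c(0, 1, 1, 0))

# The formula quoted in the comment. A * W already implies A + W + A:W,
# so the explicit W and A terms are redundant but harmless.
f <- ~ 1 + W + A + A * W

# Term labels after expansion: two main effects and one interaction
term_labels <- attr(terms(f), "term.labels")
print(term_labels)

# Design matrix: intercept, W, A, and the W-by-A product
X <- model.matrix(f, data = toy)
print(dim(X))

# With the devel branch of sl3 installed (as the comment suggests), the same
# formula would be passed to a learner; not run here:
# devtools::install_github("tlverse/sl3", ref = "devel")
# lrnr <- sl3::Lrnr_glmnet$new(formula = f)
```

Inspecting the expanded terms first is a cheap way to confirm which interactions a learner will actually see before fitting anything.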

tmle3mopttx
🎯 💯 Targeted Learning and Variable Importance for the Causal Effect of an Optimal Individualized Treatment Intervention
Project mention: [D] Is there a such thing as "Prespective Statistical Models"? | reddit.com/r/statistics | 2021-09-21
This package and the references therein allow for nonparametric estimation and inference for the optimal dynamic treatment: https://github.com/tlverse/tmle3mopttx.

causalglm
Interpretable and model-robust causal inference for heterogeneous treatment effects using generalized linear working models with targeted machine-learning
Project mention: [Q] Sensitivity of (Causal) Inference to Nonlinear Functional Form | reddit.com/r/statistics | 2021-09-28
Why not both? https://tlverse.org/causalglm/ (Will replace this with a more informative comment when I have free time later today)

Project mention: ggshakeR - R’s first all-in-one data analysis and visualization package on open soccer data | reddit.com/r/rstats | 2021-10-18
