staR - Statistics and Data Analysis powered by R

staR is free software for performing core statistical and data analysis tasks typical of introductory courses in data analysis for the social sciences.

It is primarily designed as a support tool for teaching introductory courses in data analysis for the social sciences, or other social science courses requiring data analysis tasks.

Besides offering an accessible data analysis toolkit (staR is available as an online platform requiring no installation, and its functions are invoked through Stata® commands), it also offers Stata-to-R interoperability, by interactively providing the user with R code snippets corresponding to the Stata® commands she invoked.

© Lorenzo De Sio 2014-2022.

(Stata® is a registered trademark of StataCorp LP.)

Goals

Drawing on personal experience in teaching introductory courses on data analysis, I wanted data analysis software (and so designed staR) to be...

Easy to use

allowing users to focus on key aspects of social science research (understanding social reality; theory and hypotheses, concepts, research design, indicators, choice of appropriate statistical tools) rather than requiring them to invest time and resources in learning a complex programming language whose power they might not need;

Easy to employ in teaching

providing a simple command syntax with simple options, rather than requiring instructors and students either (a) to learn and teach complex menu and window systems (with options hidden in minor buttons or dialog boxes) or (b) to learn a complex programming language designed for much more than simple data analysis;
also, producing output that can be copied and pasted easily and consistently across applications (which also facilitates course assignments);

Easy to extend

allowing instructors to extend it to implement more data analysis techniques they may need for their particular course;

Standards-based

helping instructors to provide students with skills that they may reuse, perhaps on professional statistical tools used in academia and business;

Free (and open source in the near future)

offering a free tool to students (whose often occasional use of statistical tools should not require expensive licenses) and to university departments (removing the requirement of massive license investments for introductory teaching);
providing (in the near future) an open source tool, open to amendments and extensions by the scientific community.

Features

staR addresses the above challenges by providing an online app that allows users to open datasets and perform basic statistical operations using the following commands, which act as basic functional equivalents of the same commands in the commercial Stata® software.

use, browse, doedit, drop, keep, generate, replace, recode, rename, tab1, summarize, histogram, tab2, scatter, regress, logit, eststo, esttab, help

by, if

Typing doedit will show a code editor and load a test document showing some of staR's capabilities.

Easy-to-use interface

Just like Stata and R, staR runs commands interactively and provides immediate output. Unlike in Stata and R, this output is already nicely formatted: it includes graphics in the output flow and is ready to be copied and pasted into other documents.

Code editor

staR also offers a code editor for writing and running complex scripts.

Built-in Stata-to-R translation

staR is built on a set of "translation" scripts that generate R syntax corresponding to the Stata-like syntax typed by the user; this R syntax is then executed under the hood by an R server. A key by-product of this process is that the generated R syntax is available to the user. This can be useful for teaching the R equivalents of common operations, leveraging common R packages, and thus easing a possible transition to the more complex R programming language. R code snippets are available by clicking the R button included in each command output; a full R translation of the entire session is also accessible through the menu interface.
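To make the translation idea concrete, here is a minimal sketch of what turning a Stata-like command line into R syntax could look like. It is not staR's actual code: the function names `translateToR` and `quoteNames`, the assumption that the open dataset is an R data frame called `data`, and the choice of `summary()` and `lm()` as rough R counterparts are all illustrative assumptions.

```javascript
// Hypothetical sketch, not staR's actual translation code.
// Assumption: the open dataset is available in R as a data frame named `data`.

// Turn ["age", "income"] into the R-ready string: "age", "income"
function quoteNames(names) {
  return names.map((name) => JSON.stringify(name)).join(", ");
}

// Translate one Stata-like command line into a string of R code.
function translateToR(stataLine) {
  const [command, ...args] = stataLine.trim().split(/\s+/);
  switch (command) {
    case "summarize":
      // No variable list: summarize the whole dataset.
      return args.length === 0
        ? "summary(data)"
        : `summary(data[, c(${quoteNames(args)}), drop = FALSE])`;
    case "regress": {
      if (args.length === 0) {
        throw new Error("regress needs a dependent variable");
      }
      // First variable is the outcome, the rest are predictors.
      const [outcome, ...predictors] = args;
      const rhs = predictors.length > 0 ? predictors.join(" + ") : "1";
      return `summary(lm(${outcome} ~ ${rhs}, data = data))`;
    }
    default:
      throw new Error(`No translation available for: ${command}`);
  }
}

console.log(translateToR("regress income age education"));
// prints: summary(lm(income ~ age + education, data = data))
```

The returned string is what a user would see as the R snippet for that command; in this sketch, sending it to an R server and rendering the output are left out entirely.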

staR for course instructors

staR, which is ideal (and was designed) for introductory courses in data analysis, is already available for course adoption at no cost.

However, staR is not open source yet, and you should contact me for info on how to evaluate the software, how to access a live demo, and how to deploy it in a production environment using standard university IT infrastructure (staR is available as a Docker image which can be hosted on any IT infrastructure hosting Docker containers).

NOTE: users connecting to a staR server can currently only use Stata-format datasets located at publicly accessible URLs; you will have to set up the datasets needed for your course accordingly. Support for user-supplied datasets is under consideration.

NOTE 2: a staR server consists of an R instance running on a Linux server. As R is single-threaded, a large number of concurrent users might lead to server overload (although our class experience has shown that classes of 30-50 students with moderate use run fine). Before using staR with large classes, you are strongly advised to run multi-user performance tests. In any case, performance problems can be solved by setting up multiple servers; you can then assign students to different servers, or rely on more sophisticated tools (e.g. automatic multiplication of server instances, along with automatic load balancing) which are ordinarily available in Docker hosting environments.
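As an illustration of the simpler option (manually assigning students to different servers), the sketch below maps each student ID to one server from a fixed list using a small string hash, so the same student always lands on the same server. Everything here is hypothetical: the function name `assignServer`, the server URLs, and the student IDs are made up, and staR itself does not necessarily provide anything like this.

```javascript
// Hypothetical sketch: server URLs and student IDs are invented for illustration.

// Deterministically pick a server for a student by hashing the student ID.
function assignServer(studentId, servers) {
  if (servers.length === 0) {
    throw new Error("at least one server is required");
  }
  let hash = 0;
  for (const ch of studentId) {
    // Simple polynomial string hash, kept within unsigned 32-bit range.
    hash = (hash * 31 + ch.codePointAt(0)) >>> 0;
  }
  return servers[hash % servers.length];
}

const servers = [
  "https://star1.example.org",
  "https://star2.example.org",
];

for (const id of ["s1001", "s1002", "s1003"]) {
  console.log(id, "->", assignServer(id, servers));
}
```

A static assignment like this needs no extra infrastructure, but it does not react to actual load; the automatic load balancing mentioned above is the better fit when usage is uneven.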

staR architecture and extensibility

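staR is powered by R. In particular, it provides a web-based (HTML, JavaScript) interface that receives the Stata-like commands typed by the user, translates them into R syntax, has that R syntax executed by an R server, and displays the resulting output.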

As a result, staR is easy to extend. New commands (meant as functional equivalents of their Stata counterparts) can be added by writing JavaScript scripts that receive Stata syntax and provide a corresponding R translation.

Given the power of R, the implementation of a command is in most cases a matter of selecting the appropriate package and just adding some translation and integration code. This allows any instructor with basic R knowledge to extend staR for their specific teaching purposes.
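As a rough sketch of what adding a command might involve, the example below registers a translator for a `correlate` command that maps to R's `cor()`. The registry (`registerCommand`, `translate`), the data frame name `data`, and the choice of `use = "complete.obs"` to drop incomplete cases are assumptions made for illustration; staR's real extension mechanism may be organised quite differently.

```javascript
// Hypothetical registry; staR's real extension mechanism may look different.
// Assumption: the open dataset is available in R as a data frame named `data`.
const translators = new Map();

// Associate a command name with a function that turns its arguments into R code.
function registerCommand(name, translator) {
  translators.set(name, translator);
}

// Look up the command named on the line and delegate to its translator.
function translate(line) {
  const [name, ...args] = line.trim().split(/\s+/);
  const translator = translators.get(name);
  if (!translator) {
    throw new Error(`Unknown command: ${name}`);
  }
  return translator(args);
}

// New command: correlate var1 var2 ...  ->  cor(data[, c(...)], use = "complete.obs")
// (This sketch requires an explicit variable list of at least two variables.)
registerCommand("correlate", (vars) => {
  if (vars.length < 2) {
    throw new Error("correlate needs at least two variables");
  }
  const columns = vars.map((v) => JSON.stringify(v)).join(", ");
  return `cor(data[, c(${columns})], use = "complete.obs")`;
});

console.log(translate("correlate income age"));
// prints: cor(data[, c("income", "age")], use = "complete.obs")
```

In this layout the only R knowledge needed is the target function call; parsing and dispatch stay in shared code, which is the kind of division of labour the paragraph above describes.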