## Ethics data set(s)

### Description of data

Data collected in Stat 226 class, U of Chicago, 1/7/97.
Each student was to write down two letters on an index card, reflecting:
• Sex (M or F)
• Judgment of which of two ethical violations is more severe (B or N)
The ethical violation scenarios considered were:
• B: Accepting hundreds of thousands of dollars in campaign contributions from foreign nationals with an interest in particular foreign policies, by an individual in a position to influence those policies.
• N: Accepting hundreds of thousands of dollars in tax-exempt donations, and then using those funds to promote political candidates (a non-exempt activity).
Both scenarios were purposely selected to be hypothetical.

### Results

The results of the survey are displayed in the following contingency table:
```
           |         Sex
  Scenario |         M          F |     Total
-----------+----------------------+----------
         B |         9          5 |        14
         N |         7          7 |        14
-----------+----------------------+----------
     Total |        16         12 |        28
```

The table above was produced by one of Stata's immediate commands. These commands do not perform calculations on a data set that has already been read into Stata; instead, they perform calculations on data supplied on the same line as the command, with a backslash separating the rows of the table. The "tabulate immediate" command that generated the table above is: ```tabi 9 5 \ 7 7```.

### Computer input and Stata analysis

The data can be arranged for computer input in several ways, of which the most common are indicated below:
• A single line for each individual studied, with the sex and ethical choice for that individual. This data set has 28 lines, one for each person responding to the survey. The data are coded, which means that numbers are used to represent levels of the categories. For ethical choice, 1=B and 2=N, and for sex 1=F and 2=M. [e0.raw]

When the data are in this form, we use the infile command in Stata to read the information in the file. The data set contains two columns, which we label ethics and sex. We also need to tell Stata the name of the file we are reading.

This data set can be read in and the table above can be produced by the following pair of Stata commands:

infile ethics sex using e0.raw, automatic
tabulate ethics sex
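
With numeric codes, the rows and columns of the table are headed 1 and 2 rather than B, N, F, and M. As a minimal sketch, value labels can be attached by hand so that the table shows the letters instead. The label names elbl and slbl are borrowed from the next example and are an assumption here; the coding is the one stated above.

```stata
* Run after the infile command above has read e0.raw.
* Coding from the text: ethics 1=B, 2=N; sex 1=F, 2=M.
label define elbl 1 "B" 2 "N"
label define slbl 1 "F" 2 "M"

* Tie each set of labels to its variable
label values ethics elbl
label values sex slbl

* The table now displays letters in place of the numeric codes
tabulate ethics sex
```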

• This version of the data set also has one line per individual, but instead of using numbers to code the responses, B and N are used for the ethical choice and M and F for sex. [e1.raw]

When the data are in this form, we must modify the infile command, since Stata assumes that values for variables are numerical. We ask Stata to assign numbers to the levels of each variable and to use our letter codes as labels for those numbers (so that the letters, not the assigned numbers, are displayed). For the ethics variable, the set of labels is called elbl, and for sex, the labels are called slbl. Finally, the automatic option tells Stata that these labels don't exist yet and should be created as the file is read. This data set can be read in and the table above can be produced by the following pair of Stata commands:

infile ethics:elbl sex:slbl using e1.raw, automatic
tabulate ethics sex
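
To check which number Stata assigned to each letter, the automatically created labels can be listed. This is a short sketch, assuming the infile command above has just been run in a fresh session:

```stata
* Show the number-to-letter mapping built while reading e1.raw
label list elbl slbl

* Display the table with the underlying numeric codes instead of the labels
tabulate ethics sex, nolabel
```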

• A single line for each possibility (BM, BF, NM, NF), indicating on each line sex, ethical preference, and number of individuals in that cell of the contingency table. This data set has four lines in all. [e2.raw]

When the data are in this form, we use the infile command in Stata to read the information in the file. The data set contains three columns, which we label ethics, sex, and number. We also need to tell Stata the name of the file we are reading.

This data set can be read in and the table above can be produced by the following pair of Stata commands:

infile ethics:elbl sex:slbl number using e2.raw, automatic
tabulate ethics sex [weight=number]
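
The weight tells Stata that each line stands for number individuals rather than one. As a sketch, assuming number holds whole-number counts, the same table can be requested in two equivalent ways: by naming the weight type explicitly, or by expanding the four lines back into one line per individual.

```stata
* Frequency weights: each line counts as `number' observations
tabulate ethics sex [fweight=number]

* Alternatively, replace each line by `number' copies of itself
* (9 + 5 + 7 + 7 = 28 lines) and tabulate without weights
expand number
tabulate ethics sex
```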

• A single line for each sex, indicating the total number of that sex, together with the number selecting ethical choice "B". This data set has two lines. [e3.raw]
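
No commands are given for this last layout. One possible approach, offered only as a sketch, is to rebuild the four cell counts and then proceed as for e2.raw. The column order within e3.raw (sex, total, number choosing B), the use of letters for sex, and the variable names total, freq1, and freq2 are all assumptions made for illustration.

```stata
* ASSUMPTION: each line of e3.raw holds sex (M or F), the total for
* that sex, and the number choosing B, in that order
infile sex:slbl total freq1 using e3.raw, automatic

* The number choosing N is what remains of each total
generate freq2 = total - freq1

* Turn the two lines into four: one per sex/choice cell,
* with ethics = 1 for B and ethics = 2 for N
reshape long freq, i(sex) j(ethics)

tabulate ethics sex [fweight=freq]
```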

### Estimating joint, conditional, and marginal distributions

Here is some output from Stata:
```
. tabi 9 5 \ 7 7, cell row col chi2

           |          col
       row |         1          2 |     Total
-----------+----------------------+----------
         1 |         9          5 |        14
           |     64.29      35.71 |    100.00
           |     56.25      41.67 |     50.00
           |     32.14      17.86 |     50.00
-----------+----------------------+----------
         2 |         7          7 |        14
           |     50.00      50.00 |    100.00
           |     43.75      58.33 |     50.00
           |     25.00      25.00 |     50.00
-----------+----------------------+----------
     Total |        16         12 |        28
           |     57.14      42.86 |    100.00
           |    100.00     100.00 |    100.00
           |     57.14      42.86 |    100.00

          Pearson chi2(1) =   0.5833   Pr = 0.445
```
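
Within each cell, the four numbers are the count, the row percentage, the column percentage, and the cell percentage. The row percentages estimate the conditional distribution of sex given the ethical choice, the column percentages estimate the conditional distribution of ethical choice given sex, the cell percentages estimate the joint distribution, and the Total margins estimate the two marginal distributions. As a rough check, the entries for the upper-left cell and the chi-squared statistic can be recomputed by hand with Stata's display command:

```stata
* Upper-left cell: 9 males chose B, out of 14 B's, 16 males, 28 students
* Row percentage (sex given choice B)
display 100*9/14
* Column percentage (choice B given male)
display 100*9/16
* Cell percentage (joint)
display 100*9/28

* Pearson chi-squared: each expected count is
* (row total)*(column total)/28, giving 8 for column 1 and 6 for column 2
display (9-8)^2/8 + (5-6)^2/6 + (7-8)^2/8 + (7-6)^2/6

* Upper-tail probability for chi-squared with 1 degree of freedom
display chi2tail(1, 0.5833)
```

These should agree, up to rounding, with the corresponding entries in the output above.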

9-Jan-97 RT