Problem 8
/* 8. Create a temporary SAS data set (Random) consisting of 1,000 observations, each
with a random integer from 1 to 5. Make sure that all integers in the range are
equally likely. Run PROC FREQ to test this assumption */
Code:
data a15007.random;
do i=1 to 1000;
*the RAND function with the uniform distribution returns a value between 0 and 1, so int(5*value)+1 is an integer from 1 to 5;
x=int(rand('uniform')*5)+1;
output;
end;
run;
proc freq data=a15007.random;
tables x/missing;
run;
Output :
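A minimal sketch of one way to make the run repeatable and to test the equal-likelihood assumption formally. The seed 12345 and the data set name work.random_check are arbitrary choices for illustration and are not part of the solution above. With the CHISQ option on a one-way table, PROC FREQ reports a chi-square test of equal proportions.

```sas
data work.random_check;              /* hypothetical name, stored in WORK (temporary) */
call streaminit(12345);              /* arbitrary seed so the random stream is reproducible */
do i = 1 to 1000;
   x = ceil(5 * rand('uniform'));    /* integer from 1 to 5, each equally likely */
   output;
end;
drop i;
run;

proc freq data=work.random_check;
tables x / chisq;                    /* one-way chi-square test of equal proportions */
run;
```

A large p-value from the chi-square test is consistent with the five integers being equally likely.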
Problem 10
/* 10. Data set Char_Num contains character variables Age and Weight and numeric
variables SS and Zip. Create a new, temporary SAS data set called Convert with
new variables NumAge and NumWeight that are numeric values of Age and
Weight, respectively, and CharSS and CharZip that are character variables created
from SS and Zip. CharSS should contain leading 0s and dashes in the appropriate
places for Social Security numbers and CharZip should contain leading 0s
Hint: The Z5. format includes leading 0s for the ZIP code */
Code:
*Data set CHAR_NUM;
data a15007.char_num;
input Age $ Weight $ SS Zip;
datalines;
23 155 132423222 08822
56 220 123457777 90210
74 95 012003004 78010
;
data a15007.convert;
set a15007.char_num;
*the char_num data set is also available in the blog folder shared on Dropbox;
*convert the character variables Age and Weight to numeric with INPUT;
NumAge = input(Age,8.);
NumWeight = input(Weight,8.);
*convert the numeric variables SS and Zip to character with PUT: SSN11. adds dashes and leading zeros, Z5. adds leading zeros;
CharSS = put(SS,ssn11.);
CharZip = put(Zip,z5.);
run;
proc print data=a15007.convert;
run;
Output :
Problem 12
/* 12. Using the Stocks data set (containing variables Date and Price), compute daily
changes in the prices. Use the statements here to create the plot.
Note: If you do not have SAS/GRAPH installed, use PROC PLOT and omit the
GOPTIONS and SYMBOL statements.
goptions reset=all colors=(black) ftext=swiss htitle=1.5;
symbol1 v=dot i=smooth;
title "Plot of Daily Price Differences";
proc gplot data=difference;
plot Diff*Date;
run;
quit; */
Code:
*Data set STOCKS;
data a15007.stocks;
Do date = '01Jan2006'd to '31Jan2006'd;
input Price @@;
output;
end;
format Date mmddyy10. Price dollar8.;
datalines;
34 35 39 30 35 35 37 38 39 45 47 52
39 40 51 52 45 47 48 50 50 51 52 53
55 42 41 40 46 55 52
;
data a15007.difference;
set a15007.stocks;
Diff = Dif(Price);
*using the DIF function to compute the change in Price from the previous observation;
run;
goptions reset=all colors=(black) ftext=swiss htitle=1.5;
symbol1 v=dot i=smooth;
title "Plot of Daily Price Differences";
proc gplot data=a15007.difference;
plot Diff * Date;
run;quit;
Output :
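The problem statement notes that PROC PLOT can be used when SAS/GRAPH is not installed. A minimal sketch of that fallback follows. The data set work.diff_demo and its five prices are a made-up subset for illustration. The GOPTIONS and SYMBOL statements are omitted, as the note instructs.

```sas
data work.diff_demo;                 /* hypothetical demo data set in WORK */
do Date = '01Jan2006'd to '05Jan2006'd;
   input Price @@;
   Diff = dif(Price);                /* change from the previous price, missing for the first day */
   output;
end;
format Date mmddyy10.;
datalines;
34 35 39 30 35
;

title "Plot of Daily Price Differences";
proc plot data=work.diff_demo;
plot Diff*Date;                      /* character-based plot, no SAS/GRAPH required */
run;
quit;
```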
Chapter 12
Problem 2
/*2 Using the data set Mixed, create a temporary SAS data set (also called Mixed) with
the following new variables:
a. NameLow – Name in lowercase
b. NameProp – Name in proper case
c. (Bonus – difficult) NameHard – Name in proper case without using the
PROPCASE function*/
Code:
*Data set MIXED;
data a15007.mixed;
input Name & $20. ID;
datalines;
Daniel Fields  123
Patrice Helms  233
Thomas chien  998
;
data a15007.mixed;
set a15007.mixed;
*the mixed data set is also available in the blog folder shared on Dropbox;
length First Last $ 15 NameHard $ 20;
*convert the whole name to lowercase;
NameLow = lowcase(Name);
*capitalize the first letter of each word;
NameProp = propcase(Name);
*extract the first and last names and convert each to lowercase;
First = lowcase(scan(Name,1,' '));
Last = lowcase(scan(Name,2,' '));
*uppercase only the first letter of each name by using SUBSTR on the left side of the assignment;
substr(First,1,1) = upcase(substr(First,1,1));
substr(Last,1,1) = upcase(substr(Last,1,1));
*join the two names with a single blank to get proper case without PROPCASE;
NameHard = catx(' ',First,Last);
drop First Last;
run;
proc print data=a15007.mixed;
run;
Output :
Problem 4
/*4 Data set Names_And_More contains a character variable called Height. As you can
see in the listing in Problem 3, the heights are in feet and inches. Assume that these
units can be in upper- or lowercase and there may or may not be a period following
the units. Create a temporary SAS data set (Height) that contains a numeric variable
(HtInches) that is the height in inches.*/
Code:
*Data set NAMES_AND_MORE;
data a15007.names_and_more;
input Name $20.
Phone & $14.
Height & $10.
Mixed & $8.;
datalines;
Roger Cody          (908)782-1234  5ft. 10in.  50 1/8
Thomas Jefferson    (315) 848-8484  6ft. 1in.  23 1/2
Marco Polo          (800)123-4567  5Ft. 6in.  40
Brian Watson        (518)355-1766  5ft. 10in  89 3/4
Michael DeMarco     (445)232-2233  6ft.  76 1/3
;
data a15007.height;
set a15007.names_and_more(keep = Height);
Height = compress(Height,'INFT.','i');
*COMPRESS with the i modifier removes the letters I, N, F, T and periods regardless of case, leaving only digits and blanks;
/* Alternative
Height = compress(Height,' ','kd');
*keep digits and blanks;
*/
Feet = input(scan(Height,1,' '),8.);
Inches = input(scan(Height,2,' '),?? 8.);
*SCAN extracts the first word as feet and the second word as inches, and the ?? modifier
suppresses error messages when the inches value is absent;
if missing(Inches) then HtInches = 12*Feet;
else HtInches = 12*Feet + Inches;
drop Feet Inches;
run;
title "chapter 12 - problem 4";
proc print data=a15007.height;
run;
Output :
Problem 6
/*6 Data set Study (shown here) contains the character variables Group and Dose. Create
a new, temporary SAS data set (Study) with a variable called GroupDose by putting
these two values together, separated by a dash. The length of the resulting variable
should be 6 (test this using PROC CONTENTS or the SAS Explorer). Make sure that
there are no blanks (except trailing blanks) in this value. Try this problem two ways:
first using one of the CAT functions, and second without using any CAT functions*/
Code:
*Using CAT functions;
*Data set STUDY;
data a15007.study;
input Subj : $3.
Group : $1.
Dose : $4.
Weight : $8.
Subgroup;
datalines;
001 A Low 220lbs. 2
002 A High 90Kg. 1
003 B Low 88kg 1
004 B High 165lbs. 2
005 A Low 88kG 1
;
data a15007.study;
set a15007.study;
length GroupDose $ 6;
GroupDose = catx('-',Group,Dose);
*CATX strips leading and trailing blanks and joins Group and Dose with a dash as the separator;
run;
title "chapter 12 - problem 6";
proc print data=a15007.study;
run;
*Without using CAT functions;
data a15007.study;
set a15007.study;
length GroupDose $ 6;
GroupDose = trim(Group) || '-' || Dose;
*TRIM removes the trailing blanks from Group, then the concatenation operator
joins Group, a dash, and Dose;
run;
proc print data=a15007.study;
run;
Output :
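The problem asks for the length of GroupDose to be tested with PROC CONTENTS, which the solution above does not show. A minimal sketch of that check follows. The data set work.study_check and its two rows are made up for illustration. The same PROC CONTENTS step can be pointed at the Study data set instead.

```sas
data work.study_check;               /* hypothetical demo data set in WORK */
input Group : $1. Dose : $4.;
length GroupDose $ 6;                /* set the length before the assignment */
GroupDose = catx('-', Group, Dose);
datalines;
A Low
B High
;

proc contents data=work.study_check; /* the variable list shows type and length of GroupDose */
run;
```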
Problem 8
/*8 Notice in the listing of data set Study in Problem 6 that the variable called Weight
contains units (either lbs or kgs). These units are not always consistent in case and
may or may not contain a period. Assume an upper- or lowercase LB indicates
pounds and an upper- or lowercase KG indicates kilograms. Create a new, temporary
SAS data set (Study) with a numeric variable also called Weight (careful here) that
represents weight in pounds, rounded to the nearest 10th of a pound.
Note: 1 kilogram = 2.2 pounds*/
Code:
data a15007.study;
set a15007.study(keep=Weight rename=(Weight = WeightUnits));
*COMPRESS with the kd modifiers keeps only the digits, and INPUT converts the result to a numeric value;
Weight = input(compress(WeightUnits,,'kd'),8.);
*FIND with the i modifier searches for the units regardless of case, and kilograms are converted to pounds;
if find(WeightUnits,'KG','i') then Weight = round(2.2*Weight,.1);
else if find(WeightUnits,'LB','i') then Weight = round(Weight,.1);
run;
title "chapter 12 - problem 8";
proc print data=a15007.study;
run;
Output :
Chapter 13
Problem 4
/* 4.Data set Survey2 has five numeric variables (Q1–Q5), each with values of 1, 2, 3, 4,
or 5. You want to determine for each subject (observation) if they responded with a
5 on any of the five questions. This is easily done using the OR or the IN operators.
However, for this question, use an array to check each of the five questions. Set
variable (ANY5) equal to Yes if any of the five questions is a 5 and No otherwise.*/
Code:
*Data set SURVEY2;
data a15007.survey2;
input ID
(Q1-Q5)(1.);
datalines;
535 13542
012 55443
723 21211
007 35142
;
data a15007.any5;
set a15007.survey2;
array Ques{5} Q1-Q5;
Any5 = 'No ';
do i = 1 to 5;
if Ques{i} = 5 then do;
Any5 = 'Yes';
leave;
end;
end;
drop i;
run;
title "chapter 13 - problem 4";
proc print data=a15007.any5;
run;
Output :
Chapter 14
Problem 2
/*14.2 Using the data set Sales, create the report shown here:*/
Code:
proc sort data=a15007.sales out=a15007.sales;
by Region;
run;
title "Sales ";
proc print data=a15007.sales;
by Region;
id Region;
var Quantity TotalSales;
sumby Region;
run;
Output :
Chapter 15
Problem 2
/*2 Using the Blood data set, produce a summary report showing the average WBC and
RBC count for each value of Gender as well as an overall average. Your report should
look like this:*/
Code:
*Data set BLOOD;
data a15007.bloodnew;
infile 'C:\Users\user\Desktop\sasbook\60864_example\blood.txt' truncover;
length Gender $ 6 BloodType $ 2 AgeGroup $ 5;
input Subject
Gender
BloodType
AgeGroup
WBC
RBC
Chol;
label Gender = "Gender"
BloodType = "Blood Type"
AgeGroup = "Age Group"
Chol = "Cholesterol";
run;
title "ch15-problem2";
proc report data=a15007.bloodnew nowd headline;
column Gender WBC RBC;
define Gender / group width=6;
define WBC / analysis mean "Average WBC"
width=7 format=comma6.0;
define RBC / analysis mean "Average RBC"
width=7 format=5.2;
rbreak after / dol summarize;
run;
quit;
Output :
Problem 4
/*4 Using the SAS data set BloodPressure, compute a new variable in your report. This
variable (Hypertensive) is defined as Yes for females (Gender=F) if the SBP is
greater than 138 or the DBP is greater than 88 and No otherwise. For males
(Gender=M), Hypertensive is defined as Yes if the SBP is over 140 or the DBP is over
90 and No otherwise. Your report should look like this:*/
Code:
*Data set BLOODPRESSURE;
data a15007.bloodpressure;
input Gender : $1.
Age
SBP
DBP;
datalines;
M 23 144 90
F 68 110 62
M 55 130 80
F 28 120 70
M 35 142 82
M 45 150 96
F 48 138 88
F 78 132 76
;
title "ch15-problem4";
proc report data=a15007.bloodpressure nowd;
column Gender SBP DBP Hypertensive;
define Gender / Group width=6;
define SBP / display width=5;
define DBP / display width=5;
define Hypertensive / computed "Hypertensive?" width=13;
compute Hypertensive / character length=3;
if Gender = 'F' and (SBP gt 138 or DBP gt 88)
then Hypertensive = 'Yes';
else if Gender = 'M' and
(SBP gt 140 or DBP gt 90)
then Hypertensive = 'Yes';
else Hypertensive = 'No';
endcomp;
run;
quit;
Output :
Note: To work through these problems in your own SAS editor, use the URL below, which gives access to the problems, the code, and the data sets used.
URL: https://www.dropbox.com/sh/glddfxpmr9ad9sj/AADuwUe4TDpG6CmvddS8p9uka?dl=0
Thanks!